Abstract

The problem of searching for an unknown object occurs in important applications ranging
from security and medicine to defense. Sensors with the capability to process information
rapidly require adaptive algorithms to control their search in response to noisy observations. In
this paper, we discuss classes of dynamic, adaptive search problems, and formulate the resulting
sensor control problems as stochastic control problems with imperfect information, based
on previous work on noisy search problems. The structure of these problems, with objective
functions related to information entropy, allows for a complete characterization of the optimal
strategies and the optimal cost for the resulting finite-horizon stochastic control problems. We
study the problem where an individual sensor is capable of searching over multiple sub-regions
at a time, and provide a constructive algorithm for determining optimal policies in real time
based on convex optimization. We also study the problem in which multiple sensors,
each capable of detecting over only one sub-region at a time, jointly search for
an object. Whereas this can be viewed as a special case of our multi-region results, we show
that the computation of optimal policies can be decoupled into individual single-sensor scalar
convex optimization problems, and provide simple symmetry conditions under which the solutions can
be determined analytically. We also consider the case where individual sensors can select the
accuracy of their sensing modes at different costs, and derive optimal strategies for these
problems in terms of the solutions of scalar convex optimization problems. We illustrate our
results with experiments using multiple sensors searching for a single object.


1 Introduction 

The proliferation of intelligent sensors in diverse applications, from building security and defense to
transportation and medicine, has created a need for automated processing and deduction of sensor
information. An important problem in these sensor systems is the detection and localization of objects
of interest. Intelligent sensors are able to control the nature of information collected by changing
their field of view and their sensing parameters; ideally, they should do so adaptively, exploiting
what has been learned from previous observations, to improve the accuracy of detection and
localization. In this paper, we focus on the problem of developing adaptive search policies for a
stationary object in a compact domain, with sensors that provide noisy information regarding the
presence of the object in the field of view.

The field of search theory has a long history, dating back to its early application for locating
submarines and objects at sea in the 1940s [1, 2]. The search problem was formulated as an
optimal allocation of search effort to look for a single stationary object with a single imperfect
sensor [3, 4, 5]. The sensor detects the presence of the object under a simple sensor error model with a
probability of missed detection (but no corresponding probability of a false alarm). The resulting
search strategies were open-loop search plans, which continued until an output of “detected” was
returned. The limitations of this approach were that the sensor measurements could produce no
false alarms, and that the resulting search strategies were non-adaptive. Extensions of search theory to
more complex error models that require adaptive feedback strategies have been developed in some
restricted contexts [6] where a single sensor can observe one of many possible discrete locations at
each time.

In the presence of complex noise models, the adaptive search problem can be viewed as a
problem of sensor management, looking for optimal controlled sensing policies [7]. There are many
applications of sensor management techniques that develop adaptive strategies for different sensing
problems, including function estimation [8], image acquisition [9], object classification
[10, 11, 12, 13], and object tracking [14, 15, 16]. These problems can be formulated as instances of stochastic
control problems [17] and sequential experiment design [18, 19, 20]. However, exact solution of
these stochastic dynamic programming problems is computationally prohibitive, so most of these
adaptive techniques use heuristics and approximations such as model-predictive control to obtain
strategies with manageable computational complexity.

Our formulation of the adaptive search problem is based on the approach proposed by Jedynak
et al. [21]. In their work, Jedynak et al. [21] considered the problem of localizing a stationary
object in a Euclidean space by using a single sensor that asks a sequence of yes/no questions, each
of which asks whether the object is located in a region specified by the sensor. We refer to the
questions as sensing modes in our paper, and the decisions consist of selecting the sensing modes.
The sensor observes a Boolean value corresponding to whether or not the object is located in the
queried region, but this yes/no value is corrupted by noise at the output of the sensor. We refer to
such a sensor as a Boolean sensor. Jedynak et al. formulated the optimal Boolean sensing problem
to minimize the posterior differential entropy of the object location after a fixed finite number of
measurements, and showed the existence of optimal strategies as well as explicit constructions for
the optimal adaptive strategies for several variations of this problem [21].

The problem studied in [21] has its roots in an information-theoretic problem known as the Renyi-Ulam game
[22]. Horstein [23] developed a probabilistic bisection scheme for the noisy version of this problem
in the context of sequential decoding. Burnashev and Zigangirov [24] developed an algorithm
for the case where the possible query locations are discrete and showed asymptotic decay in the
probability of location error. Nowak [25] proposed a generalized binary search algorithm to search
over a discrete location space. The Renyi-Ulam game with adversarial errors was studied in [26, 27].

Recently, the work in [21] has been generalized in several directions. Sznitman et al. [28]
considered the case where the sensors can choose different types of observations with different
costs, with application to problems in electron microscopy. Tsiligkaridis et al. [29] considered the
problem of multiple Boolean sensors performing collaborative search, where each sensor observes
a noisy measurement of the Boolean indicator that the object is contained in the observed subset.
They developed characterizations of optimal strategies for the multi-sensor case. Focusing on the
case where each sensor has a binary symmetric error model, they provided explicit analytic solutions
for the optimal multistage adaptive sensing policies. In subsequent work [30], they extended their
strategies to the decentralized case where sensors do not know each other’s error models, but
exchange local estimates of the conditional density of the object location, and provided a consensus
algorithm in which neighboring sensors exchange these local estimates to arrive at a common estimate
of the conditional density of the object location.

In this paper, we generalize the Boolean single-sensor search problem of [21] to a pair of different
search problems. First, we consider the multi-region single-sensor search problem where the sensor
can partition the object space into multiple regions and inquire as to which region the object is
located in. Second, we consider extensions of the Boolean multi-sensor search problem where there
are multiple sensors working simultaneously as a team, similar to the problem considered in [29],
but using more general error models. The second problem can be viewed as a special case of the
first problem with some added structure. We adopt a Bayes formulation similar to that in [21],
using general sensor error models, with the goal of reducing the final entropy of the conditional
probability density of the object location after a known fixed number of observations. We pose the
first problem as a stochastic control problem, and derive a complete characterization of optimal
adaptive strategies. We also provide a constructive algorithm for computing the optimal strategies
based on convex optimization, and show that the optimal strategies are independent of the problem
horizon. We further derive a lower bound on the performance of the minimum mean-square error
estimator.

For the Boolean multi-sensor search problem, we show the equivalence of this problem to our 
multiregion search problem, thereby establishing a characterization of optimal policies and the 
optimal cost. We further show that the optimal sensing strategies can be obtained in terms of 
the solution of decoupled scalar convex optimization problems, by showing that the optimal joint 
policies have a special factorization structure that can be obtained from the solution of individual 
Boolean single sensor problems. We also show the equivalence in expected performance between 
a system where multiple sensors collect information simultaneously, and one where sensors collect 
information sequentially among sensors, with information from each sensor shared instantaneously 
so it can be incorporated into the choice of other sensors’ actions. We describe a generalized 
symmetry condition for non-binary error models that enables the analytic solution of the joint 
sensing problem, and provide a constructive solution for generating the optimal adaptive sensing 
strategies. 

As a further extension, we consider the scenario where each Boolean sensor is also allowed to 
choose among error models for their observations at different costs, extending the results of [28] to 
the multi-sensor case. We derive the optimal policies for this problem of costly Boolean multi-sensor 
search. We provide explicit solutions for the optimal value function, and show that the optimal 
strategies can again be computed in terms of the solution of single-sensor problems. We provide 
experiments with two and three sensor simulations that illustrate the performance of our sensing 
strategies. 

The results of this paper are a significant extension of our previous results reported in [31],
which focused mostly on the single-sensor multi-region search problem. Even for that problem, our
exposition presents a more rigorous treatment of the optimality conditions with full proofs, a new 
symmetry condition that enables analytic solution for determining optimal strategies, and a lower 
bound on the performance of mean square estimation error. The results in this paper illustrate 
how the structure of entropy-based objectives can lead to complete characterizations of optimal 
adaptive sensing strategies, along with practical algorithms for computation of such strategies. 

The paper is structured as follows: In Section 2, we study the multi-region single-sensor search
problem. We describe the problem formulation, and derive the optimal policies for this model.
Based on the optimal cost, we provide a lower bound on the covariance of the minimum
mean-square error estimator of the object’s unknown location. In Section 3, we study the Boolean
multi-sensor search problem with general sensor error models. We describe the model formulation, and
develop the optimal solution of the model, similar to the results of [29]. We show that, for general
discrete error models, the optimal policies can be determined through the solution of decoupled
single-sensor problems, and provide a simple construction for those problems. In Section 4, we
study the multi-sensor search problem where sensors can control both the choice of sensing mode
and the precision mode of that search, in terms of a choice of error models, given a cost of
selecting the observation mode, generalizing the work of [28]. We develop a simple computational
algorithm for selecting optimal policies for sensing mode and precision mode selection. Section 5
contains simulation results using both two sensors and three sensors that illustrate the performance
of our approaches. Section 6 contains conclusions and directions for future work. The Appendix
contains the proofs of the major results.

2 Multi-Region Single-Sensor Search with Noise 

2.1 The Multi-Region Single-Sensor Model 

Consider the problem of localizing a stationary point object whose position is denoted by $X$, a
continuous-valued random vector in a compact subset $\mathcal{X}$ of $\mathbb{R}^d$ ($d \le 3$ for our purposes) with
prior probability distribution that is absolutely continuous with respect to Lebesgue measure, with
density $p_0(x)$. We assume this initial density has finite differential entropy. We have a single sensor,
which can collect measurements of the object location. As in [21, 29] and previous approaches to
search theory [3], we avoid modeling explicit sensor locations and activities, and instead model
sensor measurements as aggregate efforts over a domain of interest. In our formulation, a sensing
mode is a partition of the domain $\mathcal{X}$ into $K \ge 2$ disjoint Lebesgue measurable regions
$A^{(1)}, \ldots, A^{(K)}$ with $\bigcup_{i=1}^{K} A^{(i)} = \mathcal{X}$,
each assigned the distinct integer label $i \in \{1, \ldots, K\}$. A sensing mode will result in observed
measurement values for the sensor. We assume that the sensor collects measurements in discrete
stages by choosing its sensing mode at each stage.

In the absence of measurement noise, the value of the measurement would correspond to
identifying which region contains the object $X$. That is,

$$Z = \sum_{i=1}^{K} i \cdot \mathbf{1}_{\{X \in A^{(i)}\}}.$$

In our formulation, the sensor measurements include noise. The measurement obtained by the
sensor, $Y$, will be a random variable that can be either discrete or continuous valued. For the rest
of this paper, we assume that $Y$ is discrete-valued, with values in a discrete set $\mathcal{Y}$. Our results
extend in a straightforward manner to the case where $Y$ takes values in a continuous space. We
assume measurements can be collected at each stage $n$, with the measurement noise defined by
the conditional probability distribution of $Y_n$ given the value $Z_n$:

$$P(Y_n = y \mid Z_n = k) = f_k(y), \quad k = 1, \ldots, K.$$

We assume the measurements $Y_n$, $n = 1, \ldots, N$, are conditionally independent given the object
location $X$ and sensing modes $A_n$, $n = 1, \ldots, N$.

Our goal is to obtain $N$ measurements sequentially to improve our knowledge of the object
location $X$. Let $A_n$ denote the sensing mode used for the measurement at stage $n$: this partitions
$\mathcal{X}$ into $K$ disjoint regions $\{A_n^{(1)}, \ldots, A_n^{(K)}\}$, and let $Y_n$ denote the measurement obtained under
that mode. The information history collected by the sensor after the $n$-th measurement is denoted
as

$$D_n = \{A_1, Y_1, \ldots, A_n, Y_n\}.$$

Let $p_n(x) = p(x \mid D_n)$ denote the posterior density of $X$ given the history $D_n$. We refer to this
quantity as the information state at stage $n$. The evolution of this information state across stages is
derived using Bayesian reasoning as follows: Assume we know $p_n(x)$, and we obtain a measurement
$Y_{n+1} = y$ given sensing mode $A_{n+1}$. Then,

$$p_{n+1}(x) = p_n(x) \cdot \frac{P(Y_{n+1} = y \mid A_{n+1}, X = x)}{\int_{\mathcal{X}} p_n(\sigma)\, P(Y_{n+1} = y \mid A_{n+1}, X = \sigma)\, d\sigma}
= p_n(x) \cdot \frac{\sum_{k=1}^{K} f_k(y)\, \mathbf{1}_{\{x \in A_{n+1}^{(k)}\}}}{\int_{\mathcal{X}} p_n(\sigma) \sum_{k=1}^{K} f_k(y)\, \mathbf{1}_{\{\sigma \in A_{n+1}^{(k)}\}}\, d\sigma} \tag{1}$$

The above evolution can be viewed as a stochastic dynamical system for the information state
$p_{n+1}(x)$, where the evolution depends on the finite-valued random “disturbance” $y$, with conditional
probability $\eta(y) = \int_{\mathcal{X}} p_n(\sigma) \sum_{k=1}^{K} f_k(y)\, \mathbf{1}_{\{\sigma \in A_{n+1}^{(k)}\}}\, d\sigma$ that depends on the current
information state $p_n(x)$ and the control action $A_{n+1}$. As long as $\eta(y) > 0$, the resulting information state
$p_{n+1}$ is well-defined, and will represent a probability density on $\mathcal{X}$. For $\eta(y) = 0$, we arbitrarily
define $p_{n+1}(x) = p_n(x)$.
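As a concrete illustration of the update (1), the following is a minimal sketch on a uniform one-dimensional grid. The grid discretization, the 0-based region labels, and the names `bayes_update`, `cell_of_x` and `f` are assumptions made for this example and are not part of the original formulation.

```python
import numpy as np

def bayes_update(p, cell_of_x, f, y, dx):
    """One step of the posterior update (1) on a uniform grid (illustrative sketch).

    p         : values of the density p_n on the grid (sums to 1 after scaling by dx)
    cell_of_x : integer array; cell_of_x[j] = k means grid point j lies in region k (0-based)
    f         : array of shape (K, |Y|) with f[k, y] = P(Y = y | Z = k)
    y         : index of the observed measurement
    dx        : grid spacing
    Returns the updated density and the normalizer eta(y).
    """
    likelihood = f[cell_of_x, y]          # f_k(y) for the region containing each grid point
    eta = float(np.sum(p * likelihood) * dx)
    if eta == 0.0:                        # convention in the text: keep p_n when eta(y) = 0
        return p.copy(), eta
    return p * likelihood / eta, eta

# Example: uniform prior on [0, 1), two regions split at 0.5, asymmetric error model.
dx = 1e-3
x = (np.arange(1000) + 0.5) * dx
p0 = np.ones_like(x)
cells = (x >= 0.5).astype(int)
f = np.array([[0.9, 0.1],
              [0.2, 0.8]])
p1, eta = bayes_update(p0, cells, f, y=1, dx=dx)
```

Here `eta` equals $0.5 \cdot 0.1 + 0.5 \cdot 0.8 = 0.45$, and the posterior is piecewise constant on the two regions, as expected from (1).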

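A useful quantity in our development is $P(Z_{n+1} = k \mid p_n, A_{n+1}) = u_{n+1}^{(k)}$, computed as

$$u_{n+1}^{(k)} = \int_{A_{n+1}^{(k)}} p_n(\sigma)\, d\sigma \ge 0 \tag{2}$$

We refer to $u_{n+1} = (u_{n+1}^{(1)}, \ldots, u_{n+1}^{(K)})$ as the operating point at stage $n+1$. Note that $\sum_{k=1}^{K} u_{n+1}^{(k)} = 1$
because $A_{n+1}$ is a Lebesgue-measurable partition of $\mathcal{X}$. With this notation, the denominator in
Bayes’ rule (1) becomes $\eta(y) = \sum_{k=1}^{K} u_{n+1}^{(k)} f_k(y)$.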

Let $\Gamma^K(\mathcal{X})$ denote the set of all partitions of the domain $\mathcal{X}$ into $K$ measurable subsets. Let
$S_n$ denote the space of probability densities $p_n(x)$ over $\mathcal{X}$, corresponding to distributions that are
absolutely continuous with respect to Lebesgue measure. We define an adaptive sensing policy
$\pi = (\pi_1, \pi_2, \ldots, \pi_N)$ to be a sequence of functions $\pi_n : S_{n-1} \to \Gamma^K(\mathcal{X})$, which map the
information state $p_{n-1}(\cdot)$ into the sensing mode used at stage $n$. Let $\Pi$ denote the space of all
adaptive sensing policies.

To finalize the problem formulation, we define the objective function. We will evaluate the
quality of our knowledge of $X$ after collecting information $D_n$ by its posterior differential entropy
$H(p_n)$, defined as

$$H(p_n) = -\int_{\mathcal{X}} p_n(x) \log_2 p_n(x)\, dx.$$

Our goal is to minimize $H(p_N)$, the posterior differential entropy after $N$ measurements. The
problem of interest is to choose the adaptive policy $\pi$ to minimize the expected value of $H(p_N)$:

$$\inf_{\pi \in \Pi} E[H(p_N) \mid p_0] \tag{3}$$

Note that our objective is a nonlinear functional of the information state, unlike the standard
models for partially observed Markov decision processes [17] where the final objective is a linear
functional of the final information state.

The above dynamic decision problem can be viewed as a perfectly observed Markov decision
problem with infinite-dimensional state space $S_n$, stochastic dynamics with discrete-valued
disturbances (1), and terminal cost objective (3). The admissible control space $\Gamma^K(\mathcal{X})$ for each
information state $p_n(\cdot)$ has none of the typical topological structure (e.g. a Borel space or a metric
space) assumed in most dynamic programming results. Still, our problem satisfies the structure for
stochastic optimal control with countable disturbances described in Chapter 3 of [32]. We define
the optimal value function $V(p_n, n)$ at stage $n$ to be:


$$V(p_n, n) = \inf_{(\pi_{n+1}, \ldots, \pi_N)} E[H(p_N) \mid p_n]$$

The optimal value function has to satisfy the Bellman equation [32]:

$$V(p_n, n) = \inf_{A_{n+1}} E_{Y_{n+1}}[V(p_{n+1}, n+1) \mid A_{n+1}, p_n] \tag{4}$$

Furthermore, if a policy $\pi^*$ satisfies

$$E_{Y_{n+1}}[V(p_{n+1}, n+1) \mid \pi^*_{n+1}(p_n), p_n] = V(p_n, n)$$

for all $p_n$, then the policy is optimal.


2.2 Optimal Policies for Multi-Region Single-Sensor Search 

To derive the optimal policy, we consider the reduction in expected posterior differential entropy
$H(p_n) - E[H(p_{n+1}) \mid A_{n+1}, p_n]$ that results from a sensing mode $A_{n+1}$ based on information state
$p_n(x)$. The following proposition summarizes our result:

Proposition 2.1. The expected reduction in posterior differential entropy from a sensing mode
$A_{n+1}$ is given in terms of the operating point $u_{n+1} = (u_{n+1}^{(1)}, \ldots, u_{n+1}^{(K)})$ of (2) as

$$H(p_n) - E_{Y_{n+1}}[H(p_{n+1}) \mid A_{n+1}, p_n] = \varphi(u_{n+1})$$

where

$$\varphi(u_{n+1}) = \mathcal{H}\Big(\sum_{k=1}^{K} f_k(y)\, u_{n+1}^{(k)}\Big) - \sum_{k=1}^{K} u_{n+1}^{(k)}\, \mathcal{H}(f_k(y))$$

where $\mathcal{H}$ is the standard Shannon entropy for discrete-valued distributions.


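The proof is shown in the Appendix. One way of interpreting $\varphi(u_{n+1})$ is to consider $u_{n+1}$ as a
probability distribution for the values of a discrete-valued random variable $Z_{n+1}$, with
$P(Z_{n+1} = j) = u_{n+1}^{(j)}$. Then,

$$\varphi(u_{n+1}) = I(Y_{n+1}; Z_{n+1})$$

where the mutual information for two discrete-valued random variables is defined in terms of the
Shannon entropy as

$$I(Y; Z) = \mathcal{H}(Y) - \mathcal{H}(Y \mid Z)$$

This is readily established as

$$\begin{aligned}
I(Y; Z) &= -\sum_{y \in \mathcal{Y}} \sum_{z=1}^{K} P(y \mid z) P(z) \log \Big( \sum_{z'=1}^{K} P(y \mid z') P(z') \Big) + \sum_{z=1}^{K} P(z) \sum_{y \in \mathcal{Y}} P(y \mid z) \log P(y \mid z) \\
&= -\sum_{y \in \mathcal{Y}} \Big( \sum_{k=1}^{K} f_k(y)\, u^{(k)} \Big) \log \Big( \sum_{k=1}^{K} f_k(y)\, u^{(k)} \Big) + \sum_{k=1}^{K} u^{(k)} \sum_{y \in \mathcal{Y}} f_k(y) \log f_k(y)
\end{aligned}$$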
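The identity between the entropy-reduction function of Proposition 2.1 and the mutual information can be checked numerically. The sketch below is only an illustration: the function names and the example error model are assumptions for this example, and base-2 logarithms are used throughout to match the definition of the differential entropy.

```python
import numpy as np

def entropy_bits(q):
    """Shannon entropy in bits of a pmf, treating 0 * log(0) as 0."""
    q = np.asarray(q, dtype=float)
    nz = q > 0
    return float(-np.sum(q[nz] * np.log2(q[nz])))

def phi(u, f):
    """Entropy reduction of Proposition 2.1: H(sum_k u_k f_k) - sum_k u_k H(f_k)."""
    u = np.asarray(u, dtype=float)
    f = np.asarray(f, dtype=float)
    eta = u @ f                                   # output distribution eta(y)
    return entropy_bits(eta) - sum(u[k] * entropy_bits(f[k]) for k in range(len(u)))

def mutual_information(u, f):
    """I(Y; Z) computed directly from the joint pmf P(z, y) = u_z f_z(y)."""
    u = np.asarray(u, dtype=float)
    f = np.asarray(f, dtype=float)
    joint = u[:, None] * f
    indep = u[:, None] * joint.sum(axis=0)[None, :]
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))

# Example with K = 2 regions and a non-symmetric error model (values are illustrative).
u_example = [0.3, 0.7]
f_example = [[0.9, 0.1],
             [0.2, 0.8]]
gap = abs(phi(u_example, f_example) - mutual_information(u_example, f_example))
```

The two quantities agree up to floating-point rounding, so `gap` is numerically zero.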
Note that $\varphi(u) = I(Y; Z)$ as defined in Proposition 2.1 is strictly concave over the
simplex $\sum_{k=1}^{K} u^{(k)} = 1$, $u^{(k)} \ge 0$, $k = 1, \ldots, K$. This follows from the strict concavity of
the Shannon entropy $\mathcal{H}(f)$. Thus, it has a unique maximum value achieved at a unique point
$u^* = (u^{(1)*}, \ldots, u^{(K)*})$. Any partition for which the statistics in (2) are equal to $u^*$ achieves
the maximal differential entropy reduction at stage $n + 1$. Note that the optimal operating point
$u^*$ does not depend on the posterior density $p_n(x)$ or the partition $A_{n+1}$.

Next, we show that, for any operating point $u^*$ and information state $p_n(x)$, there exists a
sensing mode with partition $A_{n+1}$ for which $u(A_{n+1}, p_n) = u^*$. Let $d$ denote the dimension of the
Euclidean space containing $\mathcal{X}$, and let $e$ denote the $d$-dimensional vector of all 1s. Since $p_n(x)$
corresponds to a distribution that is absolutely continuous, the cumulative distribution function
$P_n(x) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_d} p_n(x')\, dx'$ is continuous, and monotone nondecreasing on the diagonal
$x = \alpha e$, starting at 0 for $\alpha \le -C$, and increasing to 1 for $\alpha \ge C$ for some $C$ because of the
compactness of $\mathcal{X}$. Hence, for any $u^{(1)*}$ we can find a value $\alpha_1$ so that $P_n(\alpha_1 e) = u^{(1)*}$, and we
can set $A_{n+1}^{(1)} = \{x \le \alpha_1 e\} \cap \mathcal{X}$, where the inequality is interpreted elementwise. Similarly, for any
$u^{(2)*}$ such that $u^{(1)*} + u^{(2)*} \le 1$, we can find $\alpha_2 \ge \alpha_1$ such that $P_n(\alpha_2 e) - P_n(\alpha_1 e) = u^{(2)*}$, and set
$A_{n+1}^{(2)} = (\{x \le \alpha_2 e\} \setminus \{x \le \alpha_1 e\}) \cap \mathcal{X}$, which for $d = 1$ is the interval $\{\alpha_1 < x \le \alpha_2\} \cap \mathcal{X}$.
We continue this construction to obtain the final $\alpha_K = C$, because
$\sum_{k=1}^{K} u^{(k)*} = 1$. The final partition $A_{n+1}$ so constructed satisfies $u(A_{n+1}, p_n) = u^*$. Note that
there are many other partitions that would also satisfy this equality, which implies that the optimal
partition is not unique.
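For a one-dimensional domain, the construction above amounts to cutting the posterior at quantiles given by the cumulative sums of $u^*$. The following sketch illustrates this on a uniform grid; the discretization, the function name `quantile_partition`, and the example density are assumptions for illustration, and the cell masses match the targets only up to the grid resolution.

```python
import numpy as np

def quantile_partition(p, dx, u_star):
    """Assign each grid point to a cell so that cell k has posterior mass close to u_star[k].

    p      : values of the density p_n on a uniform 1-D grid
    dx     : grid spacing
    u_star : target operating point (nonnegative, sums to 1)
    Returns an integer label array with values in {0, ..., K-1}.
    """
    cdf = np.cumsum(p) * dx                       # discretized P_n along the line
    thresholds = np.cumsum(u_star)[:-1]           # cumulative targets, one per cut point
    # A point gets label k when thresholds[k-1] < cdf <= thresholds[k].
    return np.searchsorted(thresholds, cdf, side="left")

# Example: triangular density 2x on [0, 1), three regions with unequal target masses.
dx = 1e-3
x = (np.arange(1000) + 0.5) * dx
p = 2.0 * x
u_target = np.array([0.2, 0.5, 0.3])
labels = quantile_partition(p, dx, u_target)
masses = np.array([p[labels == k].sum() * dx for k in range(3)])
```

The resulting `masses` are within a few grid cells' worth of probability of `u_target`; a finer grid tightens the match.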

What remains is to show that the optimal solution to the multistage adaptive policy optimization
problem (3) can be constructed in terms of the above adaptive sensing policy.

Proposition 2.2 (Optimal Policies for Single-Sensor Multi-Region Search). Let
$u^* = (u^{(1)*}, \ldots, u^{(K)*}) = \arg\max_{u^{(1)}, \ldots, u^{(K)}} \varphi(u)$ for $\varphi(u)$ as defined in Proposition 2.1.
For each stage $n$, select a
sensing mode $A_{n+1}$ that satisfies $u(A_{n+1}, p_n) = u^*$. Then, this adaptive set of policies is optimal for
problem (3). Furthermore, the optimal value function is given by

$$V(p_n, n) = H(p_n) - (N - n)\varphi^* \tag{5}$$

where the constant $\varphi^* = \varphi(u^*)$.

The proof is included in the Appendix. We note at this point that the optimal single-stage
entropy reduction $\varphi^*$ is equal to the information-theoretic channel capacity $C$ of a memoryless
communication channel whose input is the discrete variable $Z$ and whose output is the observation $Y$: both
quantities are defined by the same optimization problem.
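Because $\varphi^*$ coincides with a channel capacity, one standard way to compute $u^*$ numerically is the Blahut-Arimoto iteration. This is a swapped-in, well-known capacity algorithm rather than the specific convex optimization procedure used in the paper; the function names and iteration count below are assumptions for illustration.

```python
import numpy as np

def entropy_bits(q):
    """Shannon entropy in bits of a pmf, treating 0 * log(0) as 0."""
    q = np.asarray(q, dtype=float)
    nz = q > 0
    return float(-np.sum(q[nz] * np.log2(q[nz])))

def blahut_arimoto(f, iters=2000):
    """Approximate u* = argmax_u phi(u) and phi* for an error model f[k, y] = f_k(y)."""
    f = np.asarray(f, dtype=float)
    K = f.shape[0]
    u = np.full(K, 1.0 / K)                       # start from the uniform operating point
    for _ in range(iters):
        eta = u @ f                               # output distribution under current u
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(f > 0, np.log2(f / eta), 0.0)
        d = np.sum(f * log_ratio, axis=1)         # KL divergence D(f_k || eta) in bits
        u = u * np.exp2(d)                        # multiplicative Blahut-Arimoto update
        u /= u.sum()
    eta = u @ f
    phi_star = entropy_bits(eta) - sum(u[k] * entropy_bits(f[k]) for k in range(K))
    return u, phi_star

# Example: a Z-type error model in which region 1 is observed without error.
u_opt, phi_opt = blahut_arimoto([[1.0, 0.0],
                                 [0.5, 0.5]])
```

For this example the known capacity is $\log_2(1.25)$ bits, attained with weight $0.4$ on the noisy region. For a binary symmetric error model the iteration stays at the uniform point, consistent with the symmetry discussion later in this section.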

An important property of the above solution is that the optimal feedback strategy does not 
depend on the length of the planning horizon N. Thus, the resulting strategies are optimal for any 
duration of the planning horizon, resulting in search algorithms that are optimal no matter when 
the search terminates. 

The above results exploit several special structures of our adaptive control problem, as discussed 
below: 


• The object location must have a prior distribution over a continuous region that is absolutely
continuous with respect to Lebesgue measure. This leads to conditional cumulative
probability distributions that are continuous, and enables us to construct strategies that satisfy
the optimality conditions. This would not be the case if the potential object locations had
distributions that were not absolutely continuous.
• The differential entropy objective function allows for separability of the contribution of new
information from past information, a critical step in the development of optimality
conditions. Replacing the objective by similar functions such as Renyi entropy or other similar
divergence measures requires additional conditions to guarantee concavity as well as existence
of minimizing strategies.

• The measurement error models do not depend on the size of the regions used in the partitions 
at each stage, and depend on X only through the indicator that X is in particular regions. 

There are special cases where the optimal solution is known explicitly. One such case is when
the measurement error model satisfies a special symmetry condition. The error model from $Z$ to $Y$
is modeled as a noisy discrete memoryless channel. The quasi-symmetry condition requires that
the set of outputs $\mathcal{Y}$ can be partitioned into subsets such that, for each subset $\mathcal{Y}'$, the
sub-transition probability matrix $P(y \mid z)$ for $y \in \mathcal{Y}'$, $z \in \{1, \ldots, K\}$ satisfies the property that
each row is a permutation of every other row, and each column sums to the same subset-dependent
constant. When this channel has the property of quasi-symmetry [33], or otherwise satisfies the
property of symmetry as defined in [34], the optimal operating point satisfies
$(u^{(1)*}, \ldots, u^{(K)*}) = (1/K, \ldots, 1/K)$ ([34], Thm 4.5.2).


2.3 Mean-Square Error Lower Bound on Performance of Multi-Region Single-Sensor Search

From Proposition 2.2, the maximal expected posterior entropy reduction is $n\varphi^*$ after $n$ sensing
stages are completed, where $\varphi^*$ is the same as defined in Proposition 2.2. This allows us to give
a lower bound on the performance of the minimum mean-square error estimator, similar to
bounds reported in related work.


Proposition 2.3 (Mean-Square Error Lower Bound). Assume $H(p_0)$ is finite. Then, the minimum
mean-square error estimator at stage $n$, $\hat{X}_n = E[X \mid D_n]$, under any admissible policy has the
following mean-square error lower bound:

$$E\big[\|X - \hat{X}_n\|^2\big] \ge \frac{C_0\, d}{2\pi e}\, 2^{-2n\varphi^*/d}$$

where $d$ is the dimension of the object space, $C_0 = 2^{2H(p_0)/d}$, and $\varphi^*$ is defined in Proposition 2.2.

This lower bound decays exponentially with the number of stages, at a rate that is proportional
to the maximal one-stage expected entropy reduction $\varphi^*$.
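A bound of this type follows from the maximum-entropy property of the Gaussian distribution. The derivation below is a sketch under the assumption that entropies are measured in bits and that $\Sigma_n$ denotes the conditional covariance of $X$ given $D_n$ (notation introduced here for illustration); the constants follow from this argument and may be arranged differently in the formal proof in the Appendix.

```latex
% Step 1: among densities on R^d with covariance \Sigma, the Gaussian maximizes
% differential entropy, and the AM-GM inequality bounds det by trace:
H(p_n) \le \tfrac{1}{2}\log_2\!\big((2\pi e)^d \det \Sigma_n\big)
       \le \tfrac{d}{2}\log_2\!\Big(\tfrac{2\pi e}{d}\,\operatorname{tr}\Sigma_n\Big)
\;\Longrightarrow\;
\operatorname{tr}\Sigma_n \ge \frac{d}{2\pi e}\, 2^{2H(p_n)/d}.

% Step 2: take expectations over the history D_n and apply Jensen's inequality
% to the convex function t \mapsto 2^{2t/d}:
E\big[\|X - \hat{X}_n\|^2\big] = E[\operatorname{tr}\Sigma_n]
  \ge \frac{d}{2\pi e}\, 2^{2E[H(p_n)]/d}.

% Step 3: by Proposition 2.2, no policy reduces the expected entropy by more
% than n\varphi^* in n stages, so E[H(p_n)] \ge H(p_0) - n\varphi^*:
E\big[\|X - \hat{X}_n\|^2\big]
  \ge \frac{d}{2\pi e}\, 2^{2H(p_0)/d}\, 2^{-2n\varphi^*/d}.
```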


3 Boolean Multi-Sensor Search with Noise 

3.1 The Boolean Multi-Sensor Model 

As a special case of our previous results, we consider a problem where there are $M$ ($M \ge 2$) Boolean
sensors; each sensor can select a single region of observation, which is a Lebesgue measurable
subset of $\mathcal{X}$, and receive a noisy answer as to whether the object $X$ is in the region. The sensors
simultaneously collect measurements at each stage, and coordinate their sensing modes to develop
adaptive sensing strategies. This search model is similar to the joint sensing model studied in
[29], although our emphasis is on non-binary, non-symmetric error models whereas most of [29]
focuses on binary symmetric error models. We assume that each sensor collects discrete-valued
measurements, each of which takes values in a discrete set $\mathcal{Y}$. A decision at stage $n$ consists of
selecting sensing modes $(A_n^{(1)}, \ldots, A_n^{(M)})$ for all $M$ sensors in a batch, where $A_n^{(1)}, \ldots, A_n^{(M)} \subset \mathcal{X}$.

Given the sensing mode for sensor $m$, the error-free measured value corresponds to whether or
not the queried subset $A_n^{(m)}$ contains the object $X$, as before:

$$Z_n^{(m)} = \begin{cases} 1 & X \in A_n^{(m)} \\ 0 & X \notin A_n^{(m)} \end{cases} \quad \text{for sensor } m \text{ at stage } n$$

As before, we assume that the noisy measurements are discrete-valued, taking values in a discrete
alphabet $\mathcal{Y}$. The statistical distribution of the collected noisy measurement for each sensor $m$ is

$$P(Y_n^{(m)} = y \mid Z_n^{(m)} = i_m) = f_{i_m}^{(m)}(y), \quad y \in \mathcal{Y},\ i_m \in \{0, 1\}, \quad \text{for sensor } m \text{ at stage } n$$

We assume that the noisy measurements $Y_n^{(1)}, \ldots, Y_n^{(M)}$ are conditionally independent across
sensors given the true object location $X$ and the sensing modes $A_n^{(m)}$, $n = 1, \ldots, N$, $m = 1, \ldots, M$.
This makes the error channel from the true indicators to the measurements memoryless.

Based on the conditional independence assumptions, we define the conditional density of the
joint measurements $y = (y^{(1)}, \ldots, y^{(M)})$ given the indicator variables $i_1, \ldots, i_M$ associated with the sensing modes and
the state $X$, as

$$f_{i_{1:M}}(y) = \prod_{m=1}^{M} f_{i_m}^{(m)}(y^{(m)}) \tag{6}$$

where we use $i_{1:M}$ as a shorthand for $(i_1, \ldots, i_M)$.

As before, we will collect observations in N stages. At each stage, as the sensors make decisions 
and obtain observations, the observed noisy values are shared and the posterior probability density 
of the object location X will be updated for all sensors. Note that the posterior density of X is 
only updated after all the measurements are obtained and shared. 

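Let $A_n = (A_n^{(1)}, \ldots, A_n^{(M)})$ denote the batch sensing modes used at stage $n$, and $Y_n =
(Y_n^{(1)}, \ldots, Y_n^{(M)})$ denote the batch noisy observations collected at stage $n$. The information
history after stage $n$ is denoted by $D_n = \{A_1, Y_1, \ldots, A_n, Y_n\}$. Let the information state after stage
$n$ be denoted as $p_n(x)$, which is the posterior density $p_n(x) = p(x \mid p_0, D_n)$. This information state
evolves after collecting observations $Y_{n+1} = y = (y^{(1)}, \ldots, y^{(M)})$ for all $M$ sensors with batch
sensing modes $A_{n+1} = (A_{n+1}^{(1)}, \ldots, A_{n+1}^{(M)})$ as follows:

$$p_{n+1}(x) = p_n(x) \cdot \frac{P(Y_{n+1} = y \mid A_{n+1}, X = x)}{\int_{\mathcal{X}} p_n(\sigma)\, P(Y_{n+1} = y \mid A_{n+1}, X = \sigma)\, d\sigma}
= p_n(x) \cdot \frac{\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} f_{i_{1:M}}(y)\, \mathbf{1}_{\{x \in \bigcap_{m=1}^{M} (A_{n+1}^{(m)})^{i_m}\}}}{\int_{\mathcal{X}} p_n(\sigma) \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} f_{i_{1:M}}(y)\, \mathbf{1}_{\{\sigma \in \bigcap_{m=1}^{M} (A_{n+1}^{(m)})^{i_m}\}}\, d\sigma}$$

where we use the notation $(B)^0 = B^c$ and $(B)^1 = B$ when $B$ is a subset of $\mathcal{X}$.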

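Define the collection of statistics, called the joint operating point, $u = \{u^{i_{1:M}} : i_1, \ldots, i_M \in
\{0, 1\}\}$, as

$$u^{i_{1:M}} = \int_{\bigcap_{m=1}^{M} (A_{n+1}^{(m)})^{i_m}} p_n(\sigma)\, d\sigma \ge 0 \tag{7}$$

Then, $P(Y_{n+1} = y \mid A_{n+1}, p_n) = \eta(y)$ can be evaluated as

$$\eta(y) = \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u^{i_{1:M}} f_{i_{1:M}}(y), \quad \text{where} \quad \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u^{i_{1:M}} = 1$$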

The quality of our knowledge about the object position $X$ is evaluated by the posterior
differential entropy $H(p_n)$ of the posterior density $p_n(x)$, defined as

$$H(p_n) = -\int_{\mathcal{X}} p_n(x) \log_2 p_n(x)\, dx,$$

where $\mathcal{X}$ is the domain of $X$.

Let $S_n$ denote the space of probability densities $p_n(x)$ over $\mathcal{X}$. Let $\Gamma(\mathcal{X})$ denote the set of all
Lebesgue-measurable subsets of $\mathcal{X}$. We define an adaptive joint sensing policy $\pi = (\pi_1, \pi_2, \ldots, \pi_N)$
to be a sequence of functions where $\pi_n : S_{n-1} \to \Gamma(\mathcal{X})^M$ maps the posterior density $p_{n-1}(x)$
into admissible batch sensing modes $A_n = (A_n^{(1)}, \ldots, A_n^{(M)})$. Let $\Pi$ denote the space of all adaptive
joint sensing policies.

Our goal is to minimize $H(p_N)$, the posterior differential entropy after $N$ stages of joint
sensing:

$$\inf_{\pi \in \Pi} E[H(p_N) \mid p_0]$$


To view this as a special case of our previous results, we simply need to recognize that a set of
joint sensing modes $A = (A^{(1)}, \ldots, A^{(M)})$ induces a partition of the region $\mathcal{X}$ into $2^M$ subsets. For
each $k \in \{0, \ldots, 2^M - 1\}$, let $i_1, \ldots, i_M$ denote the dyadic expansion of $k$. Then, the subset $k$ in
the partition can be identified as

$$\tilde{A}^{(k)} = \bigcap_{m=1}^{M} (A^{(m)})^{i_m}$$

Similarly, given any partition $\tilde{A}$ of the region $\mathcal{X}$ into $2^M$ subsets, we can use the dyadic expansion
of $k$ to define joint sensing modes

$$A^{(m)} = \bigcup_{k :\, i_m(k) = 1} \tilde{A}^{(k)}$$

By identifying the joint set of observations $(Y^{(1)}, \ldots, Y^{(M)})$ as a discrete-valued observation $Y$ in
a discrete-valued set $\mathcal{Y}^M$ with conditional probability distribution as in (6), we can map this problem
into a special case of the multi-region single-sensor problem. Thus, the optimal solution for the
Boolean multi-sensor problem is a special case of the results of Proposition 2.2. This result has
also been obtained directly for Boolean multi-sensor search in Theorem 1 of [29]. We highlight this
solution below.
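The correspondence between $M$ joint sensing modes and a partition into $2^M$ cells can be sketched on a finite stand-in for the domain. In the sketch below, the function names and the convention that bit $m$ of $k$ (least significant first) encodes membership in sensor $m$'s region are assumptions for illustration; any fixed bit ordering works equally well.

```python
def joint_modes_to_partition(modes, domain):
    """Map M subsets (one per sensor) to the 2**M cells of the induced partition.

    Cell k collects the points whose membership indicators (i_1, ..., i_M)
    form the binary expansion of k, with bit m corresponding to sensor m.
    """
    M = len(modes)
    cells = [set() for _ in range(2 ** M)]
    for x in domain:
        k = sum(1 << m for m, region in enumerate(modes) if x in region)
        cells[k].add(x)
    return cells

def partition_to_joint_modes(cells, M):
    """Recover sensor m's region as the union of all cells whose index has bit m set."""
    return [set().union(*(cells[k] for k in range(2 ** M) if (k >> m) & 1))
            for m in range(M)]

# Example with two sensors on an 8-point stand-in for the domain.
domain = range(8)
modes = [{0, 1, 2, 3}, {2, 3, 4, 5}]
cells = joint_modes_to_partition(modes, domain)
recovered = partition_to_joint_modes(cells, M=2)
```

The round trip returns the original regions, and the four cells are disjoint and cover the domain, which is the property used to reduce the multi-sensor problem to the multi-region one.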

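One of the quantities of interest is the expected entropy reduction of batch sensing modes $A_{n+1}$,
corresponding to Proposition 2.1. This takes a special form for the multiple Boolean sensor case:

$$\varphi(u) = \mathcal{H}\Big( \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u^{i_{1:M}} f_{i_{1:M}} \Big) - \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u^{i_{1:M}}\, \mathcal{H}(f_{i_{1:M}}) \tag{8}$$

where $u$ is defined in (7) and $f_{i_{1:M}}$ is defined in (6).

Proposition 3.1. Given a batch of sensing modes for stage $n + 1$ as $A_{n+1} = (A_{n+1}^{(1)}, \ldots, A_{n+1}^{(M)})$,

$$H(p_n) - E[H(p_{n+1}) \mid A_{n+1} = (A_{n+1}^{(1)}, \ldots, A_{n+1}^{(M)}), p_n] = \varphi(u)$$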
The strict concavity of the Shannon entropy $\mathcal{H}(f)$ can again be used to show $\varphi(u)$ is strictly
concave and has a unique maximum at $u^*$. Furthermore, we can always find a joint sensing strategy
that achieves this optimal value using the correspondence identified above and the construction in
the proof of Proposition 2.2.

Proposition 3.2. At each stage $n$, we can always find a batch of sensing modes $A_n = (A_n^{(1)}, \ldots, A_n^{(M)})$
such that, for all $i_1, \ldots, i_M \in \{0, 1\}$,

$$\int_{\bigcap_{m=1}^{M} (A_n^{(m)})^{i_m}} p_{n-1}(x)\, dx = u^{i_{1:M}*}$$

These results can be used to establish the following result:

Proposition 3.3 (Theorem 1 in [29]). Let $u^* = \arg\max_u \varphi(u)$ for $\varphi(u)$ as defined in (8). For each
stage $n$, one can select batch sensing modes $A_n = (A_n^{(1)}, \ldots, A_n^{(M)})$ that satisfy $u(A_n, p_{n-1}) = u^*$.
Then, this adaptive set of policies is optimal for the multi-sensor joint sensing problem.
Furthermore, the optimal value function is given by

$$V(p_n, n) = H(p_n) - (N - n)\varphi^* \tag{9}$$

where $\varphi^* = \varphi(u^*)$.

Note that obtaining $u^*$ as indicated above for general discrete error models requires the solution of a large concave maximization problem (with $2^M$ variables), which can be time-consuming. We show next that we can obtain this optimal solution by solving $M$ scalar convex optimization problems, which is much simpler.

One way to view the results of Proposition 3.3 is to connect the maximal entropy reduction at each stage to the concept of channel capacity. Each sensor $m$ can be viewed as a discrete memoryless stationary channel (DMSC) whose input is $Z^{(m)} \in \{0,1\}$, whose output is $Y^{(m)} \in \mathcal{Y}$, and whose transition probabilities are specified by $f_0^{(m)}$ and $f_1^{(m)}$. Furthermore, we can regard the $q_{i_{1:M}}$ defined above as the transition probabilities of a "mixed" vector (product) channel for all the $M$ sensors used in joint sensing, shown in Fig. 1. The inputs of this mixed channel are vectors $(Z^{(1)}, \dots, Z^{(M)}) \in \{0,1\}^M$, and the outputs are $(Y^{(1)}, \dots, Y^{(M)}) \in \mathcal{Y}^M$. The capacity $C_{mix}$ of this channel is equal to $\varphi^*$, the solution of our optimization problem in Proposition 3.3.


Figure 1: The "mixed" channel.

Consider now the problem when there is only one sensor present, as in [21]. Define $\varphi^{(m)}(u^{(m)})$ to be the single sensor expected differential entropy reduction that sensor $m$ can achieve on its own by selecting its sensing mode $A^{(m)}$, as

\[ \varphi^{(m)}(u^{(m)}) = \mathcal{H}\big(u^{(m)} f_1^{(m)} + (1 - u^{(m)}) f_0^{(m)}\big) - u^{(m)} \mathcal{H}(f_1^{(m)}) - (1 - u^{(m)}) \mathcal{H}(f_0^{(m)}) \]

and define the maximum expected entropy reduction to be

\[ \varphi^{(m)*} = \max_{u^{(m)}} \varphi^{(m)}(u^{(m)}). \]

The following proposition is proved in the Appendix: 

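Proposition 3.4. Consider general discrete-output sensor error models. Denote the optimal operating points of each individual sensor $m$ as $u^{(m)*} = \arg\max_{u^{(m)}} \varphi^{(m)}(u^{(m)})$, $m = 1, \dots, M$. Then the optimal operating point for joint sensing, i.e., $u^* = \{u^*_{i_{1:M}}\} = \arg\max_u \varphi(u)$, is given by

\[ u^*_{i_{1:M}} = \prod_{m=1}^{M} \big(u^{(m)*}\big)^{i_m} \big(1 - u^{(m)*}\big)^{1 - i_m} \tag{10} \]

In addition,

\[ \varphi^* = \sum_{m=1}^{M} \varphi^{(m)*}. \]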


Thus, the optimal joint operating point for the multiple Boolean sensor case with discrete measurements can be obtained from the optimal single-sensor operating points. Furthermore, we can now use the construction of Section 2.2 to obtain partitions of the region X that achieve the probabilities required by the joint operating point, and combine them to obtain the joint optimal sensing modes for each sensor $m \in \{1, \dots, M\}$.
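
The decoupling in Proposition 3.4 can be checked numerically. The following is a minimal sketch, not the authors' implementation: the two error models are made-up values, the helper names (`phi_single`, `argmax_concave`, `joint_point`, `phi_joint`) are hypothetical, and a ternary search stands in for whatever scalar convex solver one prefers. It solves the two scalar problems, forms the product point of (10), and compares the joint value from (8) with the sum of the single-sensor values.

```python
import itertools
import math


def entropy(dist):
    """Shannon entropy (bits) of a discrete distribution given as a list."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def phi_single(u, f1, f0):
    """Single-sensor entropy reduction phi^(m)(u) for error models f1, f0."""
    mix = [u * a + (1 - u) * b for a, b in zip(f1, f0)]
    return entropy(mix) - u * entropy(f1) - (1 - u) * entropy(f0)


def argmax_concave(fn, lo=0.0, hi=1.0, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if fn(a) < fn(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


def joint_point(u_single):
    """Product-form joint operating point, as in equation (10)."""
    return {
        bits: math.prod(u if i else 1 - u for u, i in zip(u_single, bits))
        for bits in itertools.product((0, 1), repeat=len(u_single))
    }


def phi_joint(point, sensors):
    """Joint entropy reduction phi(u) of equation (8) with product channels."""
    outputs = list(itertools.product(*[range(len(f1)) for f1, _ in sensors]))

    def q(bits):
        # q_{i_{1:M}}(y) = prod_m f_{i_m}^{(m)}(y^{(m)}), listed over all joint outputs y
        return [
            math.prod((f1 if i else f0)[ym] for (f1, f0), i, ym in zip(sensors, bits, y))
            for y in outputs
        ]

    mix = [0.0] * len(outputs)
    avg = 0.0
    for bits, weight in point.items():
        qb = q(bits)
        mix = [m + weight * v for m, v in zip(mix, qb)]
        avg += weight * entropy(qb)
    return entropy(mix) - avg


# Hypothetical error models (f1, f0); these numbers are not from the paper.
sensors = [([0.9, 0.1], [0.3, 0.7]), ([0.5, 0.3, 0.2], [0.1, 0.3, 0.6])]
u_star = [argmax_concave(lambda u, s=s: phi_single(u, *s)) for s in sensors]
phi_sum = sum(phi_single(u, *s) for u, s in zip(u_star, sensors))
phi_star = phi_joint(joint_point(u_star), sensors)
print(u_star, phi_sum, phi_star)
```

The joint evaluation enumerates all $2^M$ input patterns, so it is only practical as a check for small $M$; the point of the proposition is that the $M$ scalar searches alone are enough.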


The results of Proposition 3.4 can also be used to identify an equivalence between joint sensing and sequential sensing for general Boolean sensors. The authors in [29] propose a sequential sensing scheme where each stage is divided into $M$ sub-stages. At each sub-stage $m$, the $m$-th sensor selects a sensing mode based on the information state $p_{n,m-1}$, where $p_{n,0} = p_n(x)$, and collects its noisy measurement. This measurement is processed to obtain an updated probability density $p_{n,m}(x)$ for the object location. This information is made available to the next sensor $m+1$, which in turn selects its query based on $p_{n,m}(x)$. The stage completes when the $M$-th sensor collects its measurement and uses it to update $p_{n,M-1}(x)$ to produce $p_{n+1}(x)$.

This sequential sensing scheme is similar to the binary single-sensor search problem studied in [21], with the minor extension that the sensor error models used for each sub-stage are not time invariant. From [21] we know that the optimal policies are the ones that select a sensing mode to maximally reduce the expected posterior entropy in each single sub-stage. Thus, the optimal expected differential entropy reduction for the sequential policy at the end of one cycle is precisely $\sum_{m=1}^{M} \varphi^{(m)*}$, which is the same value $\varphi^*$ that would be achieved by the joint sensing scheme in Proposition 3.4. This establishes the following lemma for general discrete observation Boolean sensors:


Lemma 3.5. Consider general discrete-output sensor error models for both the sequential and joint 
sensing models described above. Then, the optimal performance achievable at the end of a stage is 
the same for both sequential and joint sensing. 

The above lemma extends the results of [29] to Boolean sensors with general discrete error 
models and non-binary measurements. Note also that we can shuffle arbitrarily the orders in which 
sensors take measurements in a substage and achieve the same result. 

To illustrate these results, consider a problem with two Boolean sensors $f$ and $g$ whose error models are specified in Table 1.

Table 1: Two Boolean Sensors with Non-Symmetric Binary-Output Error Models

(a) Sensor $f$ ($\alpha \neq \rho$)

               y = 1          y = 0
  f_1(y)       $\alpha$       $1 - \alpha$
  f_0(y)       $1 - \rho$     $\rho$

(b) Sensor $g$ ($\beta \neq \lambda$)

               y' = 1         y' = 0
  g_1(y')      $\beta$        $1 - \beta$
  g_0(y')      $1 - \lambda$  $\lambda$

For each individual sensor, the optimal operating points correspond to the points that achieve maximal capacity in an asymmetric binary channel (e.g. [35]), and are given by:

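\[ u^{(f)*} = \frac{\rho(1 + k_1) - 1}{(\alpha + \rho - 1)(1 + k_1)}, \qquad u^{(g)*} = \frac{\lambda(1 + k_2) - 1}{(\beta + \lambda - 1)(1 + k_2)}, \]

where

\[ k_1 = \left( \frac{\alpha^{\alpha} (1 - \alpha)^{(1 - \alpha)}}{\rho^{\rho} (1 - \rho)^{(1 - \rho)}} \right)^{\frac{1}{\alpha + \rho - 1}}, \qquad k_2 = \left( \frac{\beta^{\beta} (1 - \beta)^{(1 - \beta)}}{\lambda^{\lambda} (1 - \lambda)^{(1 - \lambda)}} \right)^{\frac{1}{\beta + \lambda - 1}}. \]

We can combine these solutions as in Proposition 3.4 to obtain the following joint operating points:

\[ u^*_{i_1 i_2} = \frac{(\rho(1 + k_1) - 1)^{i_1} (\alpha(1 + k_1) - k_1)^{1 - i_1}}{(\alpha + \rho - 1)(1 + k_1)} \cdot \frac{(\lambda(1 + k_2) - 1)^{i_2} (\beta(1 + k_2) - k_2)^{1 - i_2}}{(\beta + \lambda - 1)(1 + k_2)}, \qquad i_1, i_2 \in \{0, 1\}. \]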
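
As a sanity check on the closed-form single-sensor operating point, the sketch below compares it with a brute-force grid search over $u$. The parameter values $\alpha = 0.9$, $\rho = 0.8$ are illustrative assumptions, not values from the paper, and the function names are hypothetical; entropies are in bits.

```python
import math


def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def phi(u, alpha, rho):
    """Entropy reduction for f1 = (alpha, 1 - alpha), f0 = (1 - rho, rho) over y = (1, 0)."""
    s = u * alpha + (1 - u) * (1 - rho)  # probability of observing y = 1
    return h2(s) - u * h2(alpha) - (1 - u) * h2(rho)


def closed_form(alpha, rho):
    """Operating point from the closed-form expression for an asymmetric binary sensor."""
    c = alpha + rho - 1
    k = ((alpha ** alpha * (1 - alpha) ** (1 - alpha))
         / (rho ** rho * (1 - rho) ** (1 - rho))) ** (1 / c)
    return (rho * (1 + k) - 1) / (c * (1 + k))


def grid_argmax(alpha, rho, n=100000):
    """Brute-force maximizer of phi on a uniform grid of n + 1 points."""
    return max((i / n for i in range(n + 1)), key=lambda u: phi(u, alpha, rho))


alpha, rho = 0.9, 0.8  # illustrative values, not from the paper
u_closed = closed_form(alpha, rho)
u_grid = grid_argmax(alpha, rho)
print(u_closed, u_grid)
```

The formula requires $\alpha + \rho \neq 1$; when $\alpha + \rho = 1$ the two conditional distributions coincide and the sensor carries no information.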


There is a further simplification where the optimal operating points for the joint sensing problem can be computed analytically. The results in [29] show that, when each of the $M$ sensors has a binary symmetric error model, the optimal operating points are $u^*_{i_{1:M}} = 2^{-M}$ and the optimal individual sensor operating points are $u^{(m)*} = 1/2$. We extend this further to non-binary error models where $|\mathcal{Y}| > 2$. Specifically, we consider the case of Boolean sensors with error models that correspond to quasi-symmetric memoryless channels [33] (also called symmetric discrete memoryless channels in [34]). For Boolean sensors, quasi-symmetric error models satisfy the following condition:

Definition 3.6 (Symmetry condition). A Boolean sensor has a quasi-symmetric error model if there exists a permutation $\chi(\cdot) : \mathcal{Y} \to \mathcal{Y}$ such that, for all $y \in \mathcal{Y}$, $f_1(y) = f_0(\chi(y))$ and $f_0(y) = f_1(\chi(y))$.

The implications of having quasi-symmetric error models are:

\[ \mathcal{H}(f_1) = \mathcal{H}(f_0), \qquad \sum_{y \in \mathcal{Y}} f_1(y) \log \frac{f_0(y) + f_1(y)}{2} = \sum_{y \in \mathcal{Y}} f_0(y) \log \frac{f_0(y) + f_1(y)}{2}. \]

These lead to the well-known result [33, 34] that the capacity of quasi-symmetric discrete memoryless channels is achieved by a uniform input distribution. For Boolean sensors, this means that the optimal sensing mode at stage $n$ with information state $p_{n-1}$ is to pick a subset $A_n \subset X$ such that $\int_{A_n} p_{n-1}(x)\,dx = 1/2$. This establishes the following result as a direct application of Proposition 3.4.

Corollary 3.7. If we have $M$ Boolean sensors with quasi-symmetric (albeit different) error models, then the maximum value of $\varphi(u)$ occurs at $u^*_{i_{1:M}} = 2^{-M}$.

When the error model is a discrete-continuous channel, $\mathcal{Y}$ is a subset of the real line and $f_0(y)$, $f_1(y)$ are probability densities absolutely continuous with respect to Lebesgue measure. In order for a single Boolean sensor to achieve its optimal capacity with the uniform input distribution $u = 0.5$, $1 - u = 0.5$, we must satisfy the following conditions:

\[ \mathcal{H}(f_1) = \mathcal{H}(f_0), \qquad \int_{y \in \mathcal{Y}} f_1(y) \log \frac{f_0(y) + f_1(y)}{2}\,dy = \int_{y \in \mathcal{Y}} f_0(y) \log \frac{f_0(y) + f_1(y)}{2}\,dy. \]

The derivative of $\varphi$ with respect to $u$ is:

\[ \frac{d\varphi}{du} = -\int_{y \in \mathcal{Y}} \big(f_1(y) - f_0(y)\big) \log\big(u f_1(y) + (1 - u) f_0(y)\big)\,dy - \mathcal{H}(f_1) + \mathcal{H}(f_0). \]

This derivative vanishes at $u = 0.5$ when the above conditions are satisfied. Coupled with the strict concavity of $\varphi$ in $u$, this implies that the optimal operating point for a single sensor with a discrete-continuous error model is $u = 0.5$.

A sufficient condition on the densities $f_1(y)$, $f_0(y)$ to satisfy the above conditions is that there exists some constant $a$ such that $f_0(y) = f_1(a - y)$ for all $y$. In this case, the differential entropies $\mathcal{H}(f_1)$ and $\mathcal{H}(f_0)$ are equal, and the equality

\[ \int_{y \in \mathcal{Y}} f_1(y) \log \frac{f_0(y) + f_1(y)}{2}\,dy = \int_{y \in \mathcal{Y}} f_0(y) \log \frac{f_0(y) + f_1(y)}{2}\,dy \]

is easily verified by the change of variable $y \to a - y$. These conditions are satisfied when the error models correspond to additive white Gaussian channels, so that

\[ Y_n = h(Z_n) + W_n \]

where $h(Z_n)$ is a function of the Bernoulli variable $Z_n$ and $W_n$ is zero-mean with Gaussian distribution.


4 Boolean Multi-Sensor with Precision Modes Selection 


In [28], Sznitman et al. generalized [21] to the setting where, at each sensing stage, the Boolean sensor is allowed to choose a precision mode from a finite set of precision modes, in addition to its observed subset. A precision mode selects a particular sensor error model and incurs a mode-dependent cost. Precision modes with better error models cost more to use, but may result in a greater reduction in differential entropy. In this section, we extend this to the Boolean multi-sensor joint search problem, allowing each sensor to choose its precision mode at each stage.

Assume that for the $m$-th sensor, the set of its possible precision modes is $\mathcal{L}^{(m)}$. A joint search decision at stage $n$ consists of selecting both sensing modes and precision modes for all $M$ sensors $(A_n^{(1)}, l_n^{(1)}, \dots, A_n^{(M)}, l_n^{(M)})$, where $A_n^{(m)} \subset X$ and $l_n^{(m)} \in \mathcal{L}^{(m)}$ for all $m$.

The corresponding error model of the $m$-th sensor under precision mode $l_n^{(m)} = l$ is characterized by $f_{0,l}^{(m)}$ and $f_{1,l}^{(m)}$, defined as:

\[ P\big(Y_n^{(m)} = y \mid A_n^{(m)} = A,\; l_n^{(m)} = l,\; X = x\big) = \begin{cases} f_{1,l}^{(m)}(y) & x \in A \\ f_{0,l}^{(m)}(y) & \text{otherwise} \end{cases} \]

When sensor $m$ selects a precision mode $l_n^{(m)} = l$, it incurs a cost $W^{(m)}(l)$. Note that this cost depends on the sensor index $m$ and the precision mode $l$, but does not depend on the sensing mode $A$ or the time instance $n$ (although this time invariance restriction can be easily relaxed).

Define the information state $p_n(x) = p(x \mid (A_1, l_1), Y_1, \dots, (A_n, l_n), Y_n)$. The state evolves according to Bayes' rule as before, where the choices of precision modes are used to select the appropriate likelihoods for interpreting the observed measurements $Y$. A joint sensing policy $\pi = (\pi_1, \dots, \pi_N)$ with precision mode selection is a sequence of functions that map the information state $p_{n-1}(x)$ to admissible batch sensing and precision modes $(A_n, l_n) = (A_n^{(1)}, l_n^{(1)}, \dots, A_n^{(M)}, l_n^{(M)})$ at each stage $n$. Denote the policy space by $\Pi$.

Our joint sensing objective is to minimize the sum of the final-stage expected posterior differential entropy and the total cost of all sensors weighted by a factor $\gamma$. The resulting problem is:

\[ \min_{\pi \in \Pi} E^{\pi}\Big[ H(p_N) + \gamma \sum_{t=1}^{N} \sum_{m=1}^{M} W^{(m)}(l_t^{(m)}) \Big] \]

This stochastic control problem fits the countable disturbance model of [32]. Define the optimal value function $V(p_n, n)$ at stage $n$ to be:

\[ V(p_n, n) = \inf_{\pi \in \Pi} E^{\pi}\Big[ H(p_N) + \gamma \sum_{t=n+1}^{N} \sum_{m=1}^{M} W^{(m)}(l_t^{(m)}) \,\Big|\, p_n \Big] \]

The Bellman equation for this problem is:

\[ V(p_n, n) = \inf_{A_{n+1},\, l_{n+1}} E_{Y_{n+1}}\Big[ V(p_{n+1}, n+1) + \gamma \sum_{m=1}^{M} W^{(m)}(l_{n+1}^{(m)}) \,\Big|\, (A_{n+1}, l_{n+1}),\, p_n \Big] \]

If a policy $\pi \in \Pi$ achieves equality in the Bellman equation, it is an optimal policy.

Following our previous approach, consider a set of sensing decisions $(A_{n+1}, l_{n+1})$ at stage $n+1$. Define the joint operating point $u = \{u_{i_{1:M}} : i_1, \dots, i_M \in \{0,1\}\}$ as

\[ u_{i_{1:M}} = \int_{\bigcap_{m=1}^{M} (A_{n+1}^{(m)})^{i_m}} p_n(\sigma)\,d\sigma \geq 0 \tag{11} \]

For a given set of sensing decisions at stage $n+1$, consider the one-stage gain to be the expected reduction in differential entropy minus the cost of the precision modes used by the sensors, as

\[ G(p_n, A_{n+1}, l_{n+1}) = H(p_n) - E_{Y_{n+1}}\Big[ H(p_{n+1}) + \gamma \sum_{m=1}^{M} W^{(m)}(l_{n+1}^{(m)}) \,\Big|\, (A_{n+1}, l_{n+1}),\, p_n \Big] \]

We have the following characterization: 

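Lemma 4.1. Define the function

\[ G(u, l) = \mathcal{H}\Big( \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, q^{l}_{i_{1:M}} \Big) - \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, \mathcal{H}(q^{l}_{i_{1:M}}) - \gamma \sum_{m=1}^{M} W^{(m)}(l^{(m)}) \]

where $u(\cdot)$ is defined from $p_n$ and $A$ as in (11) and

\[ q^{l}_{i_{1:M}}(y^{(1)}, \dots, y^{(M)}) = \prod_{m=1}^{M} f^{(m)}_{i_m,\, l^{(m)}}(y^{(m)}), \qquad y^{(m)} \in \mathcal{Y},\; i_m \in \{0,1\},\; \forall m. \]

Then, for all sensing decisions, the one-stage gain can be expressed as

\[ G(p_n, A_{n+1}, l_{n+1}) = G(u_{n+1}, l_{n+1}). \]

Thus, the dependence of the one-stage gain on the information state and the selected sensing modes is summarized by the statistics $u$. This structure is exploited to derive the main result: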

Proposition 4.2. Consider the problem of finding the optimal sensing modes and precision modes for the Boolean multi-sensor problem. Define $(u^*, l^*) \in \arg\max_{u,l} G(u, l)$. Then, at stage $n$, any policy that chooses batch sensing and precision modes $(A_n, l_n)$ such that $l_n = l^* = (l^{(1)*}, \dots, l^{(M)*})$ and $u(A_n, p_{n-1}) = u^*$ is optimal.

Furthermore, the optimal value function has the following closed-form expression:

\[ V(p_n, n) = H(p_n) - (N - n)\,G^* \]

where $G^* = G(u^*, l^*)$.

Note that, for each I, the function G{u,l) is strictly concave in u. Thus, we can find the 
maximum value and maximizing argument for each I, and then select the maximum value among 
the possible choices of I to obtain {u*,l*). The maximizing argument may not be unique, because 
there can be multiple precision modes with the same maximum value. As long as the sensor error 
models are stage-invariant, there is an optimal strategy where each sensor will choose the same 
sensor-dependent precision mode at every sensing stage. 
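
The two-step procedure just described (maximize over $u$ for each fixed $l$, then pick the best $l$) can be sketched for a single sensor as follows. This is an illustrative sketch only: the two precision modes, their costs, and the values of the weight $\gamma$ are assumptions made up for the example, and `select_mode` is a hypothetical helper name.

```python
import math


def entropy(dist):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def gain(u, f1, f0, cost, gamma):
    """Single-sensor one-stage gain: entropy reduction minus weighted mode cost."""
    mix = [u * a + (1 - u) * b for a, b in zip(f1, f0)]
    return entropy(mix) - u * entropy(f1) - (1 - u) * entropy(f0) - gamma * cost


def best_u(f1, f0, iters=200):
    """Ternary search over u; valid because the gain is concave in u for a fixed mode."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        # The cost term does not depend on u, so it is dropped during the search.
        if gain(a, f1, f0, 0.0, 0.0) < gain(b, f1, f0, 0.0, 0.0):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


def select_mode(modes, gamma):
    """Return (best gain, u*, l*) by maximizing over u for each precision mode l."""
    best = None
    for l, (f1, f0, cost) in enumerate(modes):
        u = best_u(f1, f0)
        g = gain(u, f1, f0, cost, gamma)
        if best is None or g > best[0]:
            best = (g, u, l)
    return best


# Hypothetical modes (f1, f0, cost): a cheap noisy mode and a costly accurate one.
modes = [([0.3, 0.7], [0.7, 0.3], 0.0), ([0.1, 0.9], [0.9, 0.1], 0.5)]
print(select_mode(modes, gamma=1.0))  # a large cost weight favors the cheap mode
print(select_mode(modes, gamma=0.2))  # a small cost weight favors the accurate mode
```

Because the enumeration over $l$ is per sensor, the total work grows with the sum of the mode counts across sensors rather than with their product.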

We now show that optimal strategies for the Boolean multi-sensor problem with precision mode selection can be computed from the solution of single sensor problems. Let

\[ G^{(m)}(u, l) = \mathcal{H}\big(u f_{1,l}^{(m)}(y) + (1 - u) f_{0,l}^{(m)}(y)\big) - u\,\mathcal{H}\big(f_{1,l}^{(m)}(y)\big) - (1 - u)\,\mathcal{H}\big(f_{0,l}^{(m)}(y)\big) - \gamma W^{(m)}(l) \]

and let $(u^{(m)*}, l^{(m)*}) = \arg\max_{u,l} G^{(m)}(u, l)$ denote an optimal operating point and choice of precision mode for sensor $m$.


Proposition 4.3. Let $(u^{(m)*}, l^{(m)*}) = \arg\max_{u,l} G^{(m)}(u, l)$ denote an optimal operating point and choice of precision mode for each sensor $m = 1, \dots, M$. Then, an optimal joint operating point and precision modes for joint sensing $(u^*, l^*)$ can be obtained as:

\[ u^*_{i_{1:M}} = \prod_{m=1}^{M} \big(u^{(m)*}\big)^{i_m} \big(1 - u^{(m)*}\big)^{1 - i_m}; \qquad l^* = (l^{(1)*}, \dots, l^{(M)*}) \tag{12} \]

In addition,

\[ G^* = \sum_{m=1}^{M} G^{(m)*}, \]

where $G^{(m)*} = G^{(m)}(u^{(m)*}, l^{(m)*})$.


5 Simulation Results of Finite Horizon Boolean Multi-Sensor Joint Search

5.1 Two Boolean Sensors with Binary Symmetric or Gaussian Error Models

In this subsection, we illustrate the previous results by simulating a scenario in which two sensors search for an object. We assume the object is located in the unit interval, i.e., $\mathcal{X} = [0,1]$. We denote the two sensors as $f$ and $g$; they cooperate and share their measurements to gain knowledge of the object position $X$. The sensing modes of $f$ and $g$ at stage $n$ are denoted as $A_n^f$ and $A_n^g$ respectively.

In terms of sensing errors, we consider two types of models. In the first model, $y$ is binary valued, with symmetric errors for each sensor, but with different probabilities of error. The measurement probability distribution functions $f_0(y)$, $f_1(y)$, $g_0(y)$ and $g_1(y)$ are summarized in Table 2(a).

For the second model, we assume that $y$ is continuous valued. The measurement $y$ corresponding to a sensing mode $A_n^f$ for sensor $f$ is given as

\[ y_n^f = 1_{\{X \in A_n^f\}} + w_n^f \]

where $w_n^f$ is a Gaussian random variable with mean 0 and variance 1. A similar model is used for sensor $g$. As required in our model, the additive noises are independent across stages and between sensors. The resulting probability densities are summarized in Table 2(b).

Table 2: The sensor specifications for two types of Boolean sensors using the optimal joint sensing policy. (a) Both sensors have symmetric error models, so $u^*_{11} = \cdots = u^*_{00} = \frac{1}{4}$. (b) Both sensors have Gaussian error models that satisfy symmetry, so $u^*_{11} = \cdots = u^*_{00} = \frac{1}{4}$.

(a) Binary symmetric error models

               y = 0     y = 1
  f_0(y)       0.8       0.2
  f_1(y)       0.2       0.8
  g_0(y)       0.7       0.3
  g_1(y)       0.3       0.7

(b) Gaussian error models

               $y \in \mathbb{R}$
  f_0(y)       $y \sim \mathcal{N}(0,1)$
  f_1(y)       $y \sim \mathcal{N}(1,1)$
  g_0(y)       $y \sim \mathcal{N}(0,1)$
  g_1(y)       $y \sim \mathcal{N}(1,1)$


Given the sensor models in Table 2, the first step is to find the operating points that maximize $\varphi(u)$ as defined in (8). In general, this would require solving a convex optimization problem, but the sensor models satisfy the symmetry conditions discussed in Section 3, so the optimal values are $u^* = (\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, \frac{1}{4})$ (by Corollary 3.7).

The optimal strategy at each stage $n$, based on the information state $p_{n-1}(x)$, is to find regions $A_n^f, A_n^g \subset [0,1]$ so that

\[ \int_{A_n^f \cap A_n^g} p_{n-1}(x)\,dx = \int_{A_n^f \cap (A_n^g)^c} p_{n-1}(x)\,dx = \int_{(A_n^f)^c \cap A_n^g} p_{n-1}(x)\,dx = \int_{(A_n^f)^c \cap (A_n^g)^c} p_{n-1}(x)\,dx = \frac{1}{4}. \]

While there are many subsets that can satisfy the above equalities, we choose our subsets to be subintervals, to resemble physical properties of sensors that will likely observe connected regions. Hence, we partition $\mathcal{X} = [0,1]$ into four subintervals, each of which has probability 1/4 according to $p_{n-1}(x)$, corresponding to $A_n^f \cap (A_n^g)^c$, $A_n^f \cap A_n^g$, $(A_n^f)^c \cap A_n^g$ and $(A_n^f)^c \cap (A_n^g)^c$. This construction is illustrated in Fig. 2. The subintervals are then used to identify the sensing modes $A_n^f, A_n^g$ for each sensor at stage $n$.


Figure 2: Partition of a line segment into four disjoint subsets at each stage. The domain $\mathcal{X}$ of the object position is the line segment $[0,1]$, divided from left to right into $A_n^f \cap (A_n^g)^c$, $A_n^f \cap A_n^g$, $(A_n^f)^c \cap A_n^g$ and $(A_n^f)^c \cap (A_n^g)^c$. At each stage, sensor $f$ will inquire subset $A_n^f$ and sensor $g$ will inquire subset $A_n^g$, thus partitioning $\mathcal{X}$ into four disjoint subsets.
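
The quartile construction can be sketched in a few lines when the information state is stored as a piecewise-constant density. The representation (lists `edges` and `dens`), the example density, and the helper names below are assumptions made for illustration; the paper does not prescribe a data structure.

```python
def quantile_cuts(edges, dens, probs):
    """Points c with CDF(c) = q for each q in probs, for a piecewise-constant density."""
    cuts = []
    for q in probs:
        cum = 0.0
        for left, right, d in zip(edges, edges[1:], dens):
            mass = d * (right - left)
            if d > 0 and cum + mass >= q:
                cuts.append(left + (q - cum) / d)
                break
            cum += mass
        else:
            cuts.append(edges[-1])  # rounding pushed q past the total mass
    return cuts


def interval_mass(edges, dens, a, b):
    """Probability of the interval [a, b] under the piecewise-constant density."""
    return sum(
        d * max(0.0, min(right, b) - max(left, a))
        for left, right, d in zip(edges, edges[1:], dens)
    )


# Hypothetical information state: density 0.4 on [0, 0.5) and 1.6 on [0.5, 1].
edges, dens = [0.0, 0.5, 1.0], [0.4, 1.6]
c1, c2, c3 = quantile_cuts(edges, dens, [0.25, 0.5, 0.75])

# Sensor f queries [0, c2) and sensor g queries [c1, c3), so their four
# intersections are the consecutive quarter-mass subintervals of Figure 2.
A_f, A_g = (0.0, c2), (c1, c3)
cells = [(0.0, c1), (c1, c2), (c2, c3), (c3, 1.0)]
masses = [interval_mass(edges, dens, a, b) for a, b in cells]
print(A_f, A_g, masses)
```

With this choice each sensor observes a single connected interval, which matches the stated preference for sensing modes that resemble connected physical regions.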


For each of the sensor models, we conducted 100 Monte Carlo experiments. In each experiment, we randomly generate an object position $X \in \mathcal{X} = [0,1]$ using a uniform distribution. We initialize our prior density $p_0(x)$ for $X$ as a uniform distribution; therefore, the initial differential entropy is $H(p_0) = -\int_0^1 1 \cdot \log_2(1)\,dx = 0$. At each stage $n > 0$, given the density $p_{n-1}(x)$, sensing modes $A_n^f, A_n^g$ are selected, and random measurements $(y_n^f, y_n^g)$ are generated according to the sensor error models. These measurements are used to update the conditional density from $p_{n-1}(x)$ to $p_n(x)$ as indicated in Subsection 3.1. We continue this process until $n = 24$ sensing stages are completed.
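
One stage of this experiment can be worked through exactly. The sketch below is an illustration under stated assumptions, not the simulation code used for the figures: it starts from the uniform prior, uses the quartile sensing modes, applies Bayes' rule for each of the four possible outcomes of the binary symmetric sensors of Table 2(a) (error probabilities 0.2 and 0.3), and averages the posterior differential entropy. The function names and the piecewise-constant representation are hypothetical.

```python
import itertools
import math


def h2(p):
    """Binary entropy in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def diff_entropy(edges, dens):
    """Differential entropy (bits) of a piecewise-constant density."""
    return -sum(
        d * math.log2(d) * (b - a)
        for a, b, d in zip(edges, edges[1:], dens) if d > 0
    )


def bayes_update(edges, dens, in_f, in_g, y_f, y_g, eps_f, eps_g):
    """Posterior density and outcome probability; in_f[k] marks cells inside sensor f's query."""
    def lik(inside, y, eps):
        # Symmetric binary sensor: reports the truth with probability 1 - eps.
        return 1 - eps if y == int(inside) else eps

    w = [d * lik(a, y_f, eps_f) * lik(b, y_g, eps_g)
         for d, a, b in zip(dens, in_f, in_g)]
    evidence = sum(wk * (r - l) for wk, l, r in zip(w, edges, edges[1:]))
    return [wk / evidence for wk in w], evidence


# Uniform prior on [0, 1] split into the four quarter-mass cells.
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
dens = [1.0, 1.0, 1.0, 1.0]
in_f = [True, True, False, False]   # sensor f queries [0, 0.5)
in_g = [False, True, True, False]   # sensor g queries [0.25, 0.75)
eps_f, eps_g = 0.2, 0.3             # error probabilities from Table 2(a)

expected_entropy = 0.0
for y_f, y_g in itertools.product((0, 1), repeat=2):
    post, prob = bayes_update(edges, dens, in_f, in_g, y_f, y_g, eps_f, eps_g)
    expected_entropy += prob * diff_entropy(edges, post)

phi_star = (1 - h2(eps_f)) + (1 - h2(eps_g))
print(expected_entropy, -phi_star)
```

The expected posterior entropy after one stage equals $H(p_0) - \varphi^*$ with $H(p_0) = 0$, which is the per-stage slope that the averaged Monte Carlo trajectories are expected to follow.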

For each experiment, we plot the differential entropy $H(p_n)$ as a function of $n$. Fig. 3(a) contains the results for the discrete measurement sensor model. At each stage $n$, the plot displays the 100 sample values of $H(p_n)$. The plot also shows four sample trajectories of experiments as dashed lines, to show the randomness in the actual trajectories. Note that the sample trajectories of posterior differential entropy are not monotonically decreasing, as they depend on random measurement values. The figure also shows in red the average of the 100 samples, which follows the linear descent predicted by the optimal value function in Proposition 3.3.

Fig. 3(b) shows similar results for the continuous measurement sensor model. The vertical scales of the two plots are different. The average posterior differential entropy decays more slowly in this graph, and the distribution of potential values has more support on higher values of differential entropy. The graph still shows the expected linear decay from Proposition 3.3.

Figure 3: The entropy reduction paths for two types of sensors: (a) binary symmetric error models, (b) Gaussian error models. The red line in each figure is the average entropy reduction path over 100 realizations. We can see that the average entropy reduction paths for both types of sensors are straight lines, i.e., the average entropy decreases linearly. (a) Both sensors have binary symmetric error models, as described in Table 2(a), and we obtain $u^*_{11} = \cdots = u^*_{00} = \frac{1}{4}$. (b) Both sensors have Gaussian error models, as described in Table 2(b), and we obtain $u^*_{11} = \cdots = u^*_{00} = \frac{1}{4}$.


5.2 Three Boolean Sensors with Ternary-Output Error Models 

In this subsection, we simulate searching for an object in $\mathcal{X} = [0,1]$ using three sensors jointly, denoted as $f$, $g$ and $h$. The sensing modes of $f$, $g$ and $h$ at stage $n$ are denoted as $A_n^f$, $A_n^g$ and $A_n^h$ respectively.

The sensor error model we consider here is shown in Table 3. Note that the error model of each sensor satisfies the symmetry conditions; thus, the optimal joint operating point is

\[ u^* = (u^*_{000}, u^*_{001}, u^*_{011}, u^*_{010}, u^*_{110}, u^*_{111}, u^*_{101}, u^*_{100}) = \big(\tfrac{1}{8}, \tfrac{1}{8}, \tfrac{1}{8}, \tfrac{1}{8}, \tfrac{1}{8}, \tfrac{1}{8}, \tfrac{1}{8}, \tfrac{1}{8}\big). \]

Table 3: The error models for three Boolean sensors

               y = 0     y = 1     y = 2
  f_0(y)       0.3       0.5       0.2
  f_1(y)       0.2       0.5       0.3
  g_0(y)       0.7       0.2       0.1
  g_1(y)       0.2       0.7       0.1
  h_0(y)       0.3       0.1       0.6
  h_1(y)       0.3       0.6       0.1

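The optimal joint sensing strategy at each stage $n$, based on the information state $p_{n-1}(x)$, is to find regions $A_n^f, A_n^g, A_n^h \subset \mathcal{X}$ so that

\[ \int_{(A_n^f)^{i_1} \cap (A_n^g)^{i_2} \cap (A_n^h)^{i_3}} p_{n-1}(x)\,dx = \frac{1}{8}, \qquad \forall\, i_1, i_2, i_3 \in \{0,1\}. \]

To find regions $A_n^f$, $A_n^g$ and $A_n^h$, we first select $\{A_{i_1 i_2 i_3} : i_1, i_2, i_3 \in \{0,1\}\}$ as displayed in Fig. 4 such that

\[ \int_{A_{i_1 i_2 i_3}} p_{n-1}(x)\,dx = \frac{1}{8}, \qquad \forall\, i_1, i_2, i_3 \in \{0,1\}. \]

Figure 4: Partition of a line segment into $2^3 = 8$ disjoint subsets at each stage.

Then, $A_n^f$, $A_n^g$ and $A_n^h$ can be constructed from $\{A_{i_1 i_2 i_3} : i_1, i_2, i_3 \in \{0,1\}\}$:

\[ A_n^f = \bigcup_{i_2=0}^{1} \bigcup_{i_3=0}^{1} A_{1 i_2 i_3}, \qquad A_n^g = \bigcup_{i_1=0}^{1} \bigcup_{i_3=0}^{1} A_{i_1 1 i_3}, \qquad A_n^h = \bigcup_{i_1=0}^{1} \bigcup_{i_2=0}^{1} A_{i_1 i_2 1}. \]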

As in the previous subsection, we conduct 100 Monte Carlo experiments under the optimal joint sensing policy for three Boolean sensors. In each experiment, the object position $X \in \mathcal{X} = [0,1]$ is randomly generated from a uniform distribution. The prior density $p_0(x)$ is initialized to be uniform over $\mathcal{X}$, making the initial differential entropy zero. At each stage $n > 0$, we select the sensing modes $A_n^f$, $A_n^g$ and $A_n^h$ based on the current information state $p_{n-1}(x)$ and the optimal joint operating point $u^*$. The noisy measurements $(y_n^f, y_n^g, y_n^h)$ are randomly generated according to the sensor error models, and are used to update the conditional density from $p_{n-1}(x)$ to $p_n(x)$. The process continues until $n = 20$ sensing stages are complete.

We compute the differential entropy of the conditional probability density of the object position at each stage of the 100 sample experiments. The results are plotted in Fig. 5. The 100 sample values of $H(p_n)$ are displayed as black dots at each stage $n$. Four sample trajectories of experiments are shown as dashed lines in order to show the randomness in the actual trajectories. The average posterior differential entropy reduction path of the 100 sample paths is shown by the red solid line. It is clear that the average posterior differential entropy decays linearly as predicted by the optimal value function in Proposition 3.3.


6 Conclusion 

In this paper, we studied the problem of optimal adaptive search for a stationary object under the 
condition of noisy sensor observations. We generalized the formulation of [21] in two directions, 
first by allowing sensors to pose multi-valued queries, and second, by allowing the use of teams of
sensors, as in [29]. We posed the adaptive sensing problem as a finite horizon stochastic control
problem, using a Bayesian formulation for information processing. The objective function was to 
minimize the posterior differential entropy of the conditional density of the object location after a 
finite number of observations. 

For the multi-region single-sensor problem, we characterized the optimal sensing strategies as those that maximally reduce the posterior differential entropy at each stage, thus resulting in a myopic or greedy strategy. These optimal sensing strategies must select sensing modes that partition the space into regions that have conditional probabilities of containing the object equal to a fixed vector of probabilities, denoted as the operating point. We derived an explicit solution for the optimal value function, and showed that such myopic strategies satisfy Bellman's equation of stochastic dynamic programming. Furthermore, we provided a convex optimization algorithm for computing the operating point for optimal strategies, and a constructive procedure for computing the optimal sensing actions in real time.

Figure 5: The entropy reduction path for three-sensor joint search under the optimal joint sensing policy, plotted against stage $n$. The red solid line is the average entropy reduction path over 100 realizations, from which we can see that the average entropy decreases linearly. The specifications of the sensors are shown in Table 3. Since the error model of each sensor satisfies the symmetry conditions, the optimal joint operating point is $u^* = (\frac{1}{8}, \frac{1}{8}, \frac{1}{8}, \frac{1}{8}, \frac{1}{8}, \frac{1}{8}, \frac{1}{8}, \frac{1}{8})$.

For the Boolean multi-sensor problems, we considered the case of sensors with general discrete-output
error models. We showed that, as in [29], the optimal strategies can be obtained using
myopic strategies. We also established that the joint convex optimization problem for the joint
operating point of the optimal strategies can be decoupled into individual scalar convex optimization 
problems, leading to a simple computational procedure for the solution of multisensor problems. In 
addition, we developed sufficient conditions for characterizing symmetry properties of general error 
models that allow for the analytic solution for the joint operating point of the optimal strategies. 

We extended our multi-sensor formulation to the case where sensors can choose between different precision modes as well as sensor modes, at a cost that depends on the precision mode selected. A choice of precision mode changes the accuracy of the sensor error model. Our results extend the single sensor results in [28]. For this case, we developed an explicit solution to the optimal value function for the stochastic control problem, and characterized optimal strategies in terms of the selection of a joint operating point and joint precision mode. We showed that this joint sensing mode and precision mode selection can be decoupled into single-sensor scalar convex optimization problems, thereby providing an efficient solution for constructing joint optimal sensing strategies.

An important note about our results is that they are tied intrinsically to the choice of performance criterion for the stochastic control problem, and to the fact that the unknown object location has an absolutely continuous probability distribution with respect to Lebesgue measure. The choice of posterior Shannon differential entropy as the primary measure of performance allows the application of many concepts from information theory that lead to the explicit optimal solutions derived in this paper. The fact that the object location distribution has a density allows us to generate sensor modes that can achieve the probabilities determined by the optimal operating points. Changing the primary objective function to related information measures such as Renyi entropy would invalidate many of the optimality results derived in this paper.

There are several directions in which this paper can be extended. One important direction 
is to develop approaches for approximating the optimal search strategies when sensors consist of 
physical platforms that must move over the region of interest to do the search. Similar issues 
arise in the results in classical search theory [3] which computes optimal allocation of search effort 
without focusing on individual platforms. Developing physically realizable sensor plans is necessary 
for implementation in multiple sensor platforms. Another extension is to consider problems where 
platform motion results in constraints as to how sensor modes can change sequentially. A third 
extension is to consider problems where sensors have constraints on the types of areas that sensors 
can observe, based on geometric constraints on sensor field of view. Other extensions include 
problems where multiple objects are present and need to be localized, and problems where different 
sensor modes have mode-dependent costs. It is unlikely that the structures exploited in these 
problems will generalize to those formulations, so the focus will be on getting lower bounds on 
performance, and developing practical approximation strategies based on such bounds. 


Appendix 


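Proof of Proposition 2.1

Proof. Let $\eta_0(y, x) = P(Y_{n+1} = y \mid A_{n+1}, X = x) = \sum_{k=1}^{K} f_k(y)\, 1_{\{x \in A_{n+1}^{k}\}}$ and $\eta(y) = \int_X p_n(x)\, \eta_0(y, x)\,dx = \sum_{k=1}^{K} u_{n+1}^{(k)} f_k(y)$. Then, using Bayes' rule, we get

\[ E_{Y_{n+1}}\big[H(p_{n+1}) \mid A_{n+1}, p_n\big] = \sum_{y \in \mathcal{Y}} H(p_{n+1})\, P(Y_{n+1} = y \mid A_{n+1}, p_n) \]

\[ = -\sum_{y \in \mathcal{Y}} \eta(y) \int_X \frac{p_n(x)\, \eta_0(y, x)}{\eta(y)} \log \frac{p_n(x)\, \eta_0(y, x)}{\eta(y)}\,dx \]

\[ = -\int_X p_n(x) \log p_n(x)\,dx + \sum_{y \in \mathcal{Y}} \int_X p_n(x)\, \eta_0(y, x) \log \eta(y)\,dx - \int_X p_n(x) \sum_{y \in \mathcal{Y}} \eta_0(y, x) \log \eta_0(y, x)\,dx \]

\[ = H(p_n) - \Big[ \mathcal{H}(\eta(y)) - \sum_{k=1}^{K} \int_X p_n(x)\, \mathcal{H}(f_k)\, 1_{\{x \in A_{n+1}^{k}\}}\,dx \Big] \]

\[ = H(p_n) - \Big[ \mathcal{H}\Big(\sum_{k=1}^{K} u_{n+1}^{(k)} f_k\Big) - \sum_{k=1}^{K} u_{n+1}^{(k)} \mathcal{H}(f_k) \Big]. \]

□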


Proof of Proposition 2.2

Proof. To establish this, we show that the value function stated in the proposition satisfies the Bellman equation and that the above policy is a minimizing policy. The value function is correct at stage $N$, as $V(p_N, N) = H(p_N)$. Assume by induction that the optimal value function has the stated form for all $k \geq n+1$. Then,

\[ V(p_n, n) = \inf_{A} E_{Y_{n+1}}\big[V(p_{n+1}, n+1) \mid A_{n+1} = A, p_n\big] \]

\[ = \inf_{A} E_{Y_{n+1}}\big[H(p_{n+1}) - (N - n - 1)\varphi^* \mid A_{n+1} = A, p_n\big] \]

\[ = \inf_{A} E_{Y_{n+1}}\big[H(p_{n+1}) \mid A_{n+1} = A, p_n\big] - (N - n - 1)\varphi^* \]

\[ = H(p_n) - \sup_{A} \Big[ \mathcal{H}\Big(\sum_{k=1}^{K} u^{(k)} f_k\Big) - \sum_{k=1}^{K} u^{(k)} \mathcal{H}(f_k) \Big] - (N - n - 1)\varphi^* \]

because of Proposition 2.1. Furthermore, we know that

\[ \sup_{A} \Big[ \mathcal{H}\Big(\sum_{k=1}^{K} u^{(k)} f_k\Big) - \sum_{k=1}^{K} u^{(k)} \mathcal{H}(f_k) \Big] = \varphi^* \]

because, given $p_n$, there is a partition $A$ such that $u^{(k)} = u^{(k)*}$. Thus,

\[ V(p_n, n) = H(p_n) - \varphi^* - (N - n - 1)\varphi^* = H(p_n) - (N - n)\varphi^*. \]

Furthermore, we have already provided a construction for choosing $A_{n+1}$ that achieves the supremum: $u(A_{n+1}, p_n) = u^*$.

□


Proof of Proposition 2.3

Proof. Let $\hat{X}_n = \int_X x\, p_n(x)\,dx$ and $\Sigma_n = E[(X - \hat{X}_n)(X - \hat{X}_n)^T]$. By Theorem 17.2.3 in [22] and Jensen's inequality, under any policy $\zeta$, we have

\[ E_\zeta[H(p_n)] \leq E_\zeta\Big[\tfrac{1}{2} \log\big((2\pi e)^d \det(\Sigma_n)\big)\Big] \leq \tfrac{1}{2} \log (2\pi e)^d + \tfrac{1}{2} \log\big(\det(E_\zeta[\Sigma_n])\big) = \tfrac{1}{2} \log\big((2\pi e)^d \det(E_\zeta[\Sigma_n])\big) \]

where $\det(\cdot)$ denotes the matrix determinant. From Proposition 2.2, under any policy we have $E_\zeta[H(p_n)] \geq H(p_0) - n\varphi^*$. By letting $C_0 = e^{2H(p_0)}$, we have

\[ \frac{C_0\, e^{-2n\varphi^*}}{(2\pi e)^d} \leq \det(E_\zeta[\Sigma_n]). \]

Since the determinant and the trace of a square matrix can be written as the product and the sum of the eigenvalues of the matrix respectively, using the inequality of arithmetic and geometric means we have

\[ \det(E_\zeta[\Sigma_n]) \leq \Big( \frac{\operatorname{tr}(E_\zeta[\Sigma_n])}{d} \Big)^d \]

where $\operatorname{tr}(\cdot)$ denotes the matrix trace. Combining and rewriting the inequalities above, we get

\[ E\big[\|X - \hat{X}_n\|_2^2\big] = E_\zeta[\operatorname{tr}(\Sigma_n)] \geq \frac{d \sqrt[d]{C_0}}{2\pi e}\, e^{-\frac{2n\varphi^*}{d}}. \]

□


Proof of Proposition 3.1


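Proof. Let $\eta_0(y, x)$ be defined as

\[ \eta_0(y, x) = P(Y_{n+1} = y \mid A_{n+1}, X = x) \]

and let $\eta(y) = \int_X p_n(x)\, \eta_0(y, x)\,dx = \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, q_{i_{1:M}}(y)$. Then,

\[ E_{Y_{n+1}}\big[H(p_{n+1}) \mid A_{n+1} = (A^{(1)}, \dots, A^{(M)}), p_n\big] \]

\[ = \sum_{y \in \mathcal{Y}^M} H(p_{n+1})\, P\big(Y_{n+1} = y = (y^{(1)}, \dots, y^{(M)}) \mid A_{n+1} = (A^{(1)}, \dots, A^{(M)}), p_n\big) \]

\[ = -\sum_{y \in \mathcal{Y}^M} \eta(y) \int_X \frac{p_n(x)\, \eta_0(y, x)}{\eta(y)} \log \frac{p_n(x)\, \eta_0(y, x)}{\eta(y)}\,dx \]

\[ = -\int_X p_n(x) \log p_n(x)\,dx + \sum_{y \in \mathcal{Y}^M} \int_X p_n(x)\, \eta_0(y, x) \log \eta(y)\,dx - \int_X p_n(x) \sum_{y \in \mathcal{Y}^M} \eta_0(y, x) \log \eta_0(y, x)\,dx \]

\[ = H(p_n) - \Big[ \mathcal{H}\Big(\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, q_{i_{1:M}}(y)\Big) - \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, \mathcal{H}(q_{i_{1:M}}) \Big]. \]

□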


Proof of Proposition 3.2

Proof. The proposition follows because $X \in \mathcal{X}$ is a continuous random variable with continuous cumulative probability distribution, whose posterior density entering stage $n$ is $p_{n-1}(x)$. Thus, given $u^* \geq 0$ with $\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u^*_{i_{1:M}} = 1$, we can use the results for the multi-region single-sensor problem to find a partition of the domain of $X$ into $2^M$ disjoint subsets $\{S_{i_{1:M}} : i_{1:M} \in \{0,1\}^M\}$ such that the probability of each subset is $\int_{S_{i_{1:M}}} p_{n-1}(x)\,dx = u^*_{i_{1:M}}$, $\forall\, i_{1:M} \in \{0,1\}^M$. Then by letting

\[ A_n^{(m)} = \bigcup_{i_{1:M} :\, i_m = 1} S_{i_{1:M}}, \qquad \forall\, m = 1, \dots, M, \]

we can realize $u^*$.

□


Proof of Proposition 


3.4 


Proof. We first prove that ip* < J2m=i 


11 11 

=H(X ■ • • X <:M • 9n:M) " X ’ ’ ' X 

2l=0 ^M—O ii=0 ^M—O 

From the additivity property of the Shannon entropy, we have: 


M 

nqn:M) = X(^(/^^)lT..=i}+^(/r^)lbg.=o}) 

m=l 


Thus, 

1 1 M 

il=0 iM=0 m=l = l *l:Af:*m=0 


24 











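Similarly, note that the term $\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u^*_{i_{1:M}}\, g_{i_{1:M}}$ specifies a joint probability distribution for the variables $Y^{(1)}, \dots, Y^{(M)}$, with the marginal probability distribution for each variable $Y^{(m)}$ given by

$$\Big(\sum_{i_{1:M}:\, i_m = 1} u^*_{i_{1:M}}\Big) f_1^{(m)} + \Big(\sum_{i_{1:M}:\, i_m = 0} u^*_{i_{1:M}}\Big) f_0^{(m)}.$$

Combining these relations and using the subadditivity property of the Shannon entropy, we obtain

$$\varphi^* \le \sum_{m=1}^{M} H\Big( \Big(\sum_{i_{1:M}:\, i_m = 1} u^*_{i_{1:M}}\Big) f_1^{(m)} + \Big(\sum_{i_{1:M}:\, i_m = 0} u^*_{i_{1:M}}\Big) f_0^{(m)} \Big) - \sum_{m=1}^{M} \Big[ \Big(\sum_{i_{1:M}:\, i_m = 1} u^*_{i_{1:M}}\Big) H(f_1^{(m)}) + \Big(\sum_{i_{1:M}:\, i_m = 0} u^*_{i_{1:M}}\Big) H(f_0^{(m)}) \Big].$$

Note that the numbers $u_1^{(m)} = \sum_{i_{1:M}:\, i_m = 1} u^*_{i_{1:M}}$ and $u_0^{(m)} = \sum_{i_{1:M}:\, i_m = 0} u^*_{i_{1:M}}$ are non-negative and sum up to 1, and thus represent a possible operating point for sensor $m$. Since $(u^{(m)*}, 1 - u^{(m)*})$ is the optimal operating point that maximizes $\varphi^{(m)}$, we have

$$
\begin{aligned}
\varphi^* &\le \sum_{m=1}^{M} H\big(u_1^{(m)} f_1^{(m)} + u_0^{(m)} f_0^{(m)}\big) - \sum_{m=1}^{M} \big( u_1^{(m)} H(f_1^{(m)}) + u_0^{(m)} H(f_0^{(m)}) \big) \\
&= \sum_{m=1}^{M} \varphi^{(m)}\big(u_1^{(m)}\big) \le \sum_{m=1}^{M} \varphi^{(m)*}.
\end{aligned}
$$

Given $u^{(m)*}$, $m = 1, \dots, M$, write $u_1^{(m)*} = u^{(m)*}$ and $u_0^{(m)*} = 1 - u^{(m)*}$, and define

$$u_{i_{1:M}} = \prod_{m=1}^{M} u_{i_m}^{(m)*}. \qquad (10)$$

Note that $\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}} = 1$, so this is a valid joint operating point $u$ for the multisensor problem. Then,

$$
\begin{aligned}
\varphi(u) &= H\Big(\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, g_{i_{1:M}}\Big) - \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, H(g_{i_{1:M}}) \\
&= H\Big(\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} \prod_{m=1}^{M} u_{i_m}^{(m)*} f_{i_m}^{(m)}\Big) - \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} \Big(\prod_{m=1}^{M} u_{i_m}^{(m)*}\Big) \sum_{m=1}^{M} H\big(f_{i_m}^{(m)}\big) \\
&= H\Big(\prod_{m=1}^{M} \sum_{j=0}^{1} u_j^{(m)*} f_j^{(m)}\Big) - \sum_{m=1}^{M} \sum_{j=0}^{1} u_j^{(m)*} H\big(f_j^{(m)}\big) \\
&= \sum_{m=1}^{M} \Big[ H\Big(\sum_{j=0}^{1} u_j^{(m)*} f_j^{(m)}\Big) - \sum_{j=0}^{1} u_j^{(m)*} H\big(f_j^{(m)}\big) \Big] \\
&= \sum_{m=1}^{M} \varphi^{(m)*}.
\end{aligned}
$$

Since $\varphi^* = \max_u \varphi(u) \le \sum_{m=1}^{M} \varphi^{(m)*}$ and $\varphi(u)$ has a unique optimal point, selecting $u$ as in (10) gives the optimal operating point for $\varphi(u)$, and we have $\varphi^* = \sum_{m=1}^{M} \varphi^{(m)*}$. □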
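The two halves of this argument can be checked numerically on a small example. The sketch below is illustrative only: it assumes discrete measurement distributions, conditionally independent sensors (so the joint distribution for each pattern is a product), and hypothetical per-sensor distributions. None of the function names come from the paper.

```python
import math
from itertools import product

def entropy(p):
    """Shannon entropy in nats of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0)

def phi_single(u, f1, f0):
    """Single-sensor information gain at operating point u."""
    mix = [u * a + (1 - u) * b for a, b in zip(f1, f0)]
    return entropy(mix) - u * entropy(f1) - (1 - u) * entropy(f0)

def joint_pmf(bits, sensors):
    """Product distribution over joint outcomes for a given inside/outside pattern."""
    per_sensor = [f1 if bit else f0 for bit, (f1, f0) in zip(bits, sensors)]
    return [math.prod(p) for p in product(*per_sensor)]

def phi_joint(u, sensors):
    """Joint information gain H(sum u*g) - sum u*H(g) for a dict u keyed by bit patterns."""
    n_outcomes = math.prod(len(f1) for f1, _ in sensors)
    mix = [0.0] * n_outcomes
    conditional = 0.0
    for bits, weight in u.items():
        g = joint_pmf(bits, sensors)
        mix = [a + weight * b for a, b in zip(mix, g)]
        conditional += weight * entropy(g)
    return entropy(mix) - conditional

def product_point(us):
    """Joint operating point built as a product of per-sensor operating points."""
    return {
        bits: math.prod(u if b else 1 - u for b, u in zip(bits, us))
        for bits in product((0, 1), repeat=len(us))
    }

def marginals(u, M):
    """Per-sensor probability that the object lies inside that sensor's region."""
    return [sum(w for bits, w in u.items() if bits[m] == 1) for m in range(M)]

# Hypothetical two-sensor example: (f1, f0) per sensor.
sensors = [([0.2, 0.8], [0.9, 0.1]), ([0.3, 0.7], [0.6, 0.4])]
```

With a product operating point the joint gain equals the sum of the single-sensor gains, while a correlated operating point such as `{(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.0, (1, 0): 0.0}` can only do as well as, or worse than, the sum of the gains at its marginals.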


Proof of Lemma 4.1

Proof.

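$$
\begin{aligned}
G(p_n, A, l) &= H(p_n) - E_{Y_{n+1}}\Big[ H(p_{n+1}) + \gamma \sum_{m=1}^{M} w^{(m)}(l^{(m)}) \,\Big|\, (A, l),\, p_n \Big] \\
&= H(p_n) - E_{Y_{n+1}}\big[ H(p_{n+1}) \mid (A, l),\, p_n \big] - \gamma \sum_{m=1}^{M} w^{(m)}(l^{(m)}) \\
&= H\Big(\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, g^{l}_{i_{1:M}}\Big) - \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, H\big(g^{l}_{i_{1:M}}\big) - \gamma \sum_{m=1}^{M} w^{(m)}(l^{(m)}) \\
&= G(u, l).
\end{aligned}
$$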


□ 


Proof of Proposition 4.2


Proof. The optimal value function $V(p_N, N) = H(p_N) - (N - N)G^* = H(p_N)$ satisfies the hypothesized form at the terminal time $N$. We show by induction that the optimal value function has the postulated form, and that the optimal strategies in the theorem achieve the infimum in Bellman's equation:

$$V(p_n, n) = \inf_{A_{n+1},\, l_{n+1}} E_{Y_{n+1}}\Big[ V(p_{n+1}, n+1) + \gamma \sum_{m=1}^{M} w^{(m)}\big(l^{(m)}_{n+1}\big) \,\Big|\, (A_{n+1}, l_{n+1}),\, p_n \Big].$$


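Assuming that $V(p_{n+1}, n+1) = H(p_{n+1}) - (N - n - 1)G^*$, we have

$$
\begin{aligned}
E_{Y_{n+1}}\Big[ V(p_{n+1}, n+1) &+ \gamma \sum_{m=1}^{M} w^{(m)}\big(l^{(m)}_{n+1}\big) \,\Big|\, (A_{n+1}, l_{n+1}),\, p_n \Big] \\
&= E_{Y_{n+1}}\Big[ H(p_{n+1}) - (N - n - 1)G^* + \gamma \sum_{m=1}^{M} w^{(m)}\big(l^{(m)}_{n+1}\big) \,\Big|\, (A_{n+1}, l_{n+1}),\, p_n \Big] \\
&= H(p_n) - G(u_{n+1}, l_{n+1}) - (N - n - 1)G^*
\end{aligned}
$$

by Lemma 4.1.

For fixed $l$, $G(u, l)$ is strictly concave in $u$ over the set $\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}} = 1$ due to the strict concavity of the Shannon entropy. Thus,

$$G^* = \sup_{u,\, l} G(u, l).$$

Furthermore, we know from the discussion after Proposition 2.1 and Proposition 3.2 that, for any density $p_n(x)$ and desired joint operating point $u$, there exists a joint sensing mode $A_{n+1}$ such that the probabilities $u(A_{n+1}, p_n) = u$. Thus,

$$
\begin{aligned}
V(p_n, n) &= \inf_{A_{n+1},\, l_{n+1}} \big[ H(p_n) - G(u_{n+1}, l_{n+1}) - (N - n - 1)G^* \big] \\
&= H(p_n) - \sup_{A_{n+1},\, l_{n+1}} G(u_{n+1}, l_{n+1}) - (N - n - 1)G^* \\
&= H(p_n) - (N - n)G^*.
\end{aligned}
$$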
Note that

$$\sup_{u,\, l} G(u, l) = \max_{u,\, l} G(u, l)$$

is achieved at some $(u^*, l^*)$ because it is the maximum of a finite number of strictly concave functions defined over the compact M-dimensional simplex. Thus, the optimal strategies are given by any $(A_{n+1}, l_{n+1})$ such that $l_{n+1} = l^*$ and $u(A_{n+1}, p_n) = u^*$.

□ 
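The structure of this result, a per-stage gain $G^*$ found by maximizing over the operating point and the sensing mode, followed by a value function that is linear in the remaining horizon, can be sketched for a single sensor. The example below is a sketch under assumptions that are not taken from the paper: each mode is modelled as a binary symmetric channel with error probability `eps` and a hypothetical cost, entropies are in nats, and the concave maximization over `u` uses a plain ternary search.

```python
import math

def binary_entropy(eps):
    """Entropy in nats of a Bernoulli(eps) variable."""
    if eps <= 0.0 or eps >= 1.0:
        return 0.0
    return -eps * math.log(eps) - (1 - eps) * math.log(1 - eps)

def gain(u, eps, gamma, cost):
    """G(u, l) for one sensor whose mode l is a binary symmetric channel with error eps."""
    p_one = u * (1 - eps) + (1 - u) * eps   # probability of observing Y = 1
    return binary_entropy(p_one) - binary_entropy(eps) - gamma * cost

def maximize_concave(f, lo=0.0, hi=1.0, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    x = 0.5 * (lo + hi)
    return x, f(x)

def optimal_gain(modes, gamma):
    """Return (G*, u*, index of l*) over a finite list of (eps, cost) modes."""
    best = None
    for index, (eps, cost) in enumerate(modes):
        u, value = maximize_concave(lambda t: gain(t, eps, gamma, cost))
        if best is None or value > best[0]:
            best = (value, u, index)
    return best

def value_function(entropy_now, n, N, g_star):
    """V(p_n, n) = H(p_n) - (N - n) * G*."""
    return entropy_now - (N - n) * g_star

# Hypothetical modes: (error probability, cost).
modes = [(0.2, 0.0), (0.05, 0.5)]
g_star, u_star, l_star = optimal_gain(modes, gamma=1.0)
```

For symmetric channels the maximizer is `u = 0.5`, so `g_star` should match the closed form `log 2 - h(eps) - gamma * cost` for the best mode; in this example the cheaper, noisier mode wins because the accurate mode's cost outweighs its extra information.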


Proof of Proposition 4.3 


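Proof. We first prove that $G^* \le \sum_{m=1}^{M} G^{(m)*}$. At an optimal joint operating point $(u^*, l^*)$, with $l^* = (l^*_1, \dots, l^*_M)$,

$$
\begin{aligned}
G^* &= H\Big(\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u^*_{i_{1:M}}\, g^{l^*}_{i_{1:M}}\Big) - \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u^*_{i_{1:M}}\, H\big(g^{l^*}_{i_{1:M}}\big) - \gamma \sum_{m=1}^{M} w^{(m)}(l^*_m) \\
&\le \sum_{m=1}^{M} \Big[ H\Big( \Big(\sum_{i_{1:M}:\, i_m = 1} u^*_{i_{1:M}}\Big) f_1^{(m)} + \Big(\sum_{i_{1:M}:\, i_m = 0} u^*_{i_{1:M}}\Big) f_0^{(m)} \Big) - \Big(\sum_{i_{1:M}:\, i_m = 1} u^*_{i_{1:M}}\Big) H\big(f_1^{(m)}\big) - \Big(\sum_{i_{1:M}:\, i_m = 0} u^*_{i_{1:M}}\Big) H\big(f_0^{(m)}\big) - \gamma\, w^{(m)}(l^*_m) \Big] \\
&\le \sum_{m=1}^{M} G^{(m)*},
\end{aligned}
$$

where $f_1^{(m)}$ and $f_0^{(m)}$ denote the measurement distributions of sensor $m$ under mode $l^*_m$.

The first inequality results from the subadditivity and additivity properties of the Shannon entropy. The second inequality is true because $(u^{(m)*}, l^{(m)*})$, $m = 1, \dots, M$, are the optimal points of $G^{(m)}$, $m = 1, \dots, M$, respectively.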

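Let

$$u_{i_{1:M}} = \prod_{m=1}^{M} u_{i_m}^{(m)*}, \qquad l = \big(l^{(1)*}, \dots, l^{(M)*}\big), \qquad (12)$$

where $u_1^{(m)*} = u^{(m)*}$ and $u_0^{(m)*} = 1 - u^{(m)*}$. Plugging these into $G(u, l)$, with $f_j^{(m)}$ denoting the measurement distributions of sensor $m$ under mode $l^{(m)*}$, we have

$$
\begin{aligned}
G(u, l) &= H\Big(\sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, g^{l}_{i_{1:M}}\Big) - \sum_{i_1=0}^{1} \cdots \sum_{i_M=0}^{1} u_{i_{1:M}}\, H\big(g^{l}_{i_{1:M}}\big) - \gamma \sum_{m=1}^{M} w^{(m)}\big(l^{(m)*}\big) \\
&= \sum_{m=1}^{M} H\Big(\sum_{j=0}^{1} u_j^{(m)*} f_j^{(m)}\Big) - \sum_{m=1}^{M} \sum_{j=0}^{1} u_j^{(m)*} H\big(f_j^{(m)}\big) - \gamma \sum_{m=1}^{M} w^{(m)}\big(l^{(m)*}\big) \\
&= \sum_{m=1}^{M} \Big[ H\Big(\sum_{j=0}^{1} u_j^{(m)*} f_j^{(m)}\Big) - \sum_{j=0}^{1} u_j^{(m)*} H\big(f_j^{(m)}\big) - \gamma\, w^{(m)}\big(l^{(m)*}\big) \Big] \\
&= \sum_{m=1}^{M} G^{(m)*}.
\end{aligned}
$$

Since $G^* = \max_{(u, l)} G(u, l) \le \sum_{m=1}^{M} G^{(m)*}$, selecting $(u, l)$ as in (12) gives an optimal operating point for $G(u, l)$, and we have $G^* = \sum_{m=1}^{M} G^{(m)*}$. □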


References 

[1] B. O. Koopman, “Search and Screening. Operations Evaluation Group Report No. 56,” Center 
for Naval Analyses, Alexandria, VA, Tech. Rep., 1946. 

[2] -, Search and Screening: General Principles with Historical Applications. Pergamon, New 

York NY, 1980. 

[3] L. D. Stone, Theory of optimal search. Academic Press New York, 1975. 

[4] -, “Search theory: a mathematical theory for finding lost objects,” Mathematics Magazine, 

pp. 248-256, 1977. 

[5] S. J. Benkoski, M. G. Monticino, and J. R. Weisinger, “A survey of the search theory litera¬ 
ture,” Naval Research Logistics, vol. 38, no. 4, pp. 469-494, 1991. 

[6] D. A. Castahon, “Optimal search strategies in dynamic hypothesis testing,” IEEE Transactions 
on Systems, Man and Cybernetics, vol. 25, no. 7, pp. 1130-1138, Jul 1995. 

[7] A. O. Hero, D. A. Castahon, D. Cochran, and K. Kastella, Foundations and Applications of 
Sensor Management. Springer, 2008. 

[8] R. Castro and R. Nowak, “Active learning and sampling,” in Foundations and Applications of 
Sensor Management, A. Hero, D. A. Castahon, D. Cochran, and K. Kastella, Eds. Springer, 
2008. 

[9] R. Sznitman and B. Jedynak, “Active testing for face detection and localization,” IEEE Trans¬ 
actions on Pattern Analysis and Machine Intelligence, 2010. 


28 






[10] D. A. Castanon, “Stochastic control bounds on sensor network performance,” in Proc. IEEE 
Conference on Decision and Control and European Control Conference, Seville, Spain, 2005, 
pp. 4939-4944. 

[11] D. C. Hitchings and D. A. Castanon, “Receding horizon stochastic control algorithms for sensor 
management,” Proc. American Control Conference, June 2010. 

[12] K. L. Jenkins and D. A. Castanon, “Information-based adaptive sensor management for sensor 
networks,” Proc. American Control Conference, June 2011. 

[13] D. C. Hitchings and D. A. Castanon, “Sensor control for search and identification of markov 
objects,” Proc. IEEE Conference on Decision and Control and European Control Conference, 
December 2011. 

[14] J. L. Williams, J. W. Fisher III, and A. S. Willsky, “Approximate dynamic programming 
for communication-constrained sensor network management,” IEEE Transactions on Signal 
Processing, 2007. 

[15] C. Kreucher, A. O. Hero HI, K. Kastella, and M. R. Morelande, “An information-based ap¬ 
proach to sensor management in large dynamic networks,” Proceedings of IEEE, 2007. 

[16] C. Kreucher, K. Kastella, and A. O. Hero HI, “Sensor management using an active sensing 
approach,” Signal Processing, vol. 85, no. 3, pp. 607-624, 2005. 

[17] D. P. Bertsekas, Dynamic Programming and Optimal Control, 3rd ed. Athena Scientific, 2005, 
vol. 1. 

[18] M. H. DeCroot, Optimal Statistical Decisions. McCraw Hill, 1970. 

[19] G. B. Wetherill and K. D. Glazebrook, Sequential Methods in Statistics, 3rd ed., ser. Mono¬ 
graphs on Statistics and Applied Probability. Chapman & Hall, 1986. 

[20] H. Robbins, “Some aspects of the sequential design of experiments,” Bulletin of the American 
Mathematical Society, 1952. 

[21] B. Jedynak, P. I. Frazier, and R. Sznitman, “Twenty questions with noise: Bayes optimal 
policies for entropy loss,” Journal of Applied Probability, vol. 49, pp. 114-136, 2011. 

[22] T. M. Cover and J. A. Thomas, Elements of information theory. John Wiley &: Sons, 2012. 

[23] M. Horstein, “Sequential transmission using noiseless feedback,” IEEE Transactions on Infor¬ 
mation Theory, vol. 9, no. 3, pp. 136-143, 1963. 

[24] M. V. Burnashev and K. Zigangirov, “An interval estimation problem for controlled observa¬ 
tions,” Problemy Peredachi Informatsii, vol. 10, no. 3, pp. 51-61, 1974. 

[25] R. Nowak, “Generalized binary search,” in Proc. Allerton Conference Communication, Control, 
and Computing, Monticello, IL, 2008, pp. 568-574. 

[26] A. Dhagat, P. Gacs, and P. Winkler, “On playing “twenty questions” with a liar,” in Proc. 
third annual ACM-SIAM symposium on Discrete algorithms, 1992, pp. 16-22. 

[27] J. Spencer, “Ulam’s searching game with a fixed number of lies,” Theoretical Computer Science, 
vol. 95, no. 2, pp. 307-321, 1992. 


29 



[28] R. Sznitman, A. Lucchi, P. I. Frazier, B. Jedynak, and P. Fua, “An optimal policy for tar¬ 
get localization with application to electron microscopy,” Proc. International Conference on 
Machine Learning, 2013. 

[29] T. Tsiligkaridis, B. M. Sadler, and A. O. Hero III, “Collaborative 20 questions for target 
localization,” IEEE Transactions on Information Theory, vol. 60, no. 4, pp. 2233-2252, 2014. 

[30] -, “On decentralized estimation with active queries,” IEEE Transactions on Signal Pro¬ 

cessing, vol. 63, no. 10, pp. 2610-2622, 2015. 

[31] H. Ding and D. A. Castahon, “Optimal solutions for classes of adaptive search problems,” in 
Proc. IEEE Conference on Decison and Control, Osaka, Japan, December 2015. 

[32] D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control: the Discrete Time Case. Or¬ 
lando: Academic Press, 1978. 

[33] P.-N. Chen and F. Alajaji, Lectures Notes in Information Theory, 2000. 

[34] R. G. Gallager, Information Theory and Reliable Communication. New York: John Wiley, 
1968. 

[35] P.-O. Amblard, O. J. Michel, and S. Morfu, “Revisiting the asymmetric binary channel: joint 
noise-enhanced detection and information transmission through threshold devices,” in Proc. 
SPIE 5845 Noise in Complex Systems and Stochastic Dynamics III, May 2005. 


30 



